VOLUME X    MAY, 1915





ASSOCIATION OF FINGER-PRINTS. 


By H. WAITE, M.A., B.Sc. 


1. Introduction. Certain papers have been published in recent years giving 
the results of research on the variability and correlation of the hand, notably 
(1) “A First Study of the Variability and Correlation of the Hand,” by 
Miss M. A. Whiteley, B.Sc., and Karl Pearson, F.R.S., Proceedings of the Royal 
Society, Vol. 65, pp. 126—151, and (2) “A Second Study of the Variability 
and Correlation of the Hand,” by M. A. Lewenz, B.A., and M. A. Whiteley, B.Sc., 
Biometrika, Vol. 1, pp. 345—360. In the former the writers urge “the import- 
ance of putting on record all the quantitative measures we can possibly ascertain 
of variability and correlation ” of characters of the human body. Although Finger- 
Prints, the characters dealt with in the present paper, cannot strictly claim to 
be quantitative it is hoped by the writer that the results may prove of some 
interest and use in the solution of the great Problem of Evolution in Man, 
especially when compared with the results obtained from the study of other 
measurements of the hand. 


The principal motive underlying most of the work which has been done in the 
past on the subject of Finger-Prints has arisen from the development of means of 
identification and it was based on the fact that the general pattern and character- 
istics of the finger-prints of any individual are persistent throughout life. As far 
as I am aware, however, no paper has yet been published attempting to measure 
the association between the various types of finger-prints in an individual or com- 
paring these with the relations which have been found to exist between other 
measurements of the hand. These are the objects of the present paper. 


2. Primary Classification of Finger-Prints. As primary classification 
Purkenje proposed nine types, Galton* three (each being divided into twenty-four
sub-classes), and Henry† four, these also being sub-divided into a number of
classes. For the purposes of this paper I have adopted the method of dividing all
the prints into four primary classes; I have also adopted Henry’s definitions and

* Fingerprint Directories, by Francis Galton, F.R.S. Macmillan, 1895.

† Classification and Uses of Finger Prints, by Sir E. R. Henry, C.V.O., C.S.I. Wyman and Sons,
Third edition, 1905.




nomenclature as far as they are required, and these follow in general those of
Galton. Secondary classification with its minute details is not used in this paper.

The four classes referred to above are Arches, Loops, Whorls and Composites.

In Arches the ridges run from side to side, consecutive ridges being roughly
parallel and the curvature increasing in general from the base to the tip.
(Plate XX. Fig. i.)


In Loops some of the ridges are doubled back upon themselves making a half 
turn or a little more, the two parts of the doubled ridge diverging from each other 
at the centre of the pattern. (Fig. ii.) Consequently this pattern has an open 
mouth directed downwards either towards the right or towards the left of the 
finger. The direction of this opening supplies a means of subdividing Loops into 
Radial and Ulnar Loops according as the direction is towards the radius or towards
the ulna, that is, towards or away from the thumb. As will be seen later (p. 422B)
the proportion of Radial Loops is very small except in the forefinger, so that this 
method of subdivision has been used only in dealing with that finger. 


In Whorls some of the ridges make a complete circuit, either as closed con- 
centric ovals or as a more or less continuous ridge forming a spiral. (Fig. iii.) 


Composites consist of combinations of two or more of the other patterns. 
(Fig. iv.) In this class are also included those finger-prints which are too irregular 
in general outline to be placed in any one of the other main groups. 


This class also includes the bulk of those patterns about which Sir Francis Galton,
in his book on Finger Prints*, p. 79, states: “They are as much Loops as Whorls,
and properly ought to be relegated to a fourth class.” It is possible, however,
that some of Galton’s “ambiguous cases” may have been classed in this paper 
with Loops. 


For further details of these principal classes with their modifications and sub- 
divisions reference may be made to the works mentioned in the footnotes on 


p. 421. 


3. Material. The material on which this investigation is based consists of 
two thousand complete sets of finger-prints of adult males, part of a much longer 
series in the Biometric Laboratory of University College, London. They belong 
to the lower type of artisan and labouring classes. No selection whatever has been 
made, except that a few sets, which were incomplete or which contained prints so 
damaged as to be indecipherable, have been rejected. 

4. Symbols. The following symbols are used: A = Arch; SL = Small Loop;
LL = Large Loop (see p. 423); W = Whorl; C = Composite; Lᵣ = Radial Loop;
Lᵤ = Ulnar Loop; R = Right Hand; L = Left Hand. R₁, R₂, R₃, R₄, R₅ designate
the thumb, forefinger, middle, ring and little finger respectively of the right hand,
and L₁, L₂, L₃, L₄, L₅ represent the corresponding fingers of the left hand.


* Finger Prints, by Francis Galton, F.R.S., Macmillan, 1892. 











Biometrika, Vol. X, Part IV Plate XX 








Fig. (i). Arch. Fig. (ii). Loop. 





Fig. (iii). Whorl. Fig. (iv). Composite. 


Illustrations of the four fundamental types of Finger-Print. 




























5. Distribution of Classes of Finger-Prints. A preliminary survey of the prints 
brings to light a considerable clustering together of prints of the same kind. 
Thus, each of 241 sets contains prints of one class only; each of 329 sets has nine
prints of one class, and each of 194 sets contains eight out of the ten prints of one
class; that is, each of 764 sets, or over 38%, has at least eight prints of one class,
the large majority of these being loops. Again, each of 892 sets contains prints of
two classes only, so that each of 1133 sets, or nearly 57% of the whole, has
representatives of not more than two of the four classes. On the other hand all
four classes appear in only 95 sets, while the number of single hands, each of
which contains at least one of every class, is only 23.

For the calculations which follow it has been found advisable to subdivide the 
loops into two classes, Small Loops and Large Loops (p. 423). Considering these 
as separate classes, giving five types in all, the distribution of numbers of types 
for the two hands is shown in the following Table: 


TABLE 1.
Distribution of Types in Right and Left Hands.

                    |      Number of Types in Right Hand      |
Number of Types     |    1  |    2  |    3  |    4  |    5    | Totals
in Left Hand        |
--------------------+-------+-------+-------+-------+---------+--------
        1           |   37  |   84  |   47  |    6  |    -    |   174
        2           |   65  |  465  |  360  |   61  |    4    |   955
        3           |   15  |  256  |  347  |   96  |    2    |   716
        4           |    1  |   36  |   83  |   30  |    1    |   151
        5           |    -  |    1  |    2  |    1  |    -    |     4
--------------------+-------+-------+-------+-------+---------+--------
     Totals         |  118  |  842  |  839  |  194  |    7    |  2000











In this Table, taking as origin the cell (3, 2) containing 360 types, we have the
following results:

    Mean of Left Hand Types,                   .428
    Standard deviation of Left Hand Types,     .7628
    Mean of Right Hand Types,                 -.435
    Standard deviation of Right Hand Types,    .7608

We thus find the correlation coefficient (r) to be .281 ± .014.

The contingency coefficient (c), corrected for the number of cells, is .289. Hence
we conclude that there is a distinct, though not very great tendency towards 
equality in the number of types in the two hands of an individual. It appears, 
however, that the divergence is rather greater in the right than in the left hand. 
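The figures quoted above can be checked from Table 1 with a short computation. The following is only a sketch, not the author's working: the function name `product_moment` is hypothetical, the last row of the table is inferred from the printed row and column totals, and the probable error is assumed to follow the usual formula .6745(1 - r²)/√n, which the text does not state.

```python
from math import sqrt

# Table 1: rows = number of types in the left hand (1-5),
# columns = number of types in the right hand (1-5).
# Assumption: the fifth row is inferred from the printed marginal totals.
TABLE_1 = [
    [37, 84, 47, 6, 0],
    [65, 465, 360, 61, 4],
    [15, 256, 347, 96, 2],
    [1, 36, 83, 30, 1],
    [0, 1, 2, 1, 0],
]


def product_moment(table, origin_right=3, origin_left=2):
    """Means, standard deviations and r, measured from the stated origin cell."""
    n = sum(sum(row) for row in table)
    sx = sy = sxx = syy = sxy = 0
    for left, row in enumerate(table, start=1):
        for right, freq in enumerate(row, start=1):
            x = right - origin_right   # deviation in right-hand types
            y = left - origin_left     # deviation in left-hand types
            sx += freq * x
            sy += freq * y
            sxx += freq * x * x
            syy += freq * y * y
            sxy += freq * x * y
    mean_right, mean_left = sx / n, sy / n
    sd_right = sqrt(sxx / n - mean_right ** 2)
    sd_left = sqrt(syy / n - mean_left ** 2)
    r = (sxy / n - mean_right * mean_left) / (sd_right * sd_left)
    # Assumed probable-error formula for a correlation coefficient.
    probable_error = 0.6745 * (1 - r * r) / sqrt(n)
    return {
        "mean_left": mean_left, "sd_left": sd_left,
        "mean_right": mean_right, "sd_right": sd_right,
        "r": r, "probable_error": probable_error,
    }


stats = product_moment(TABLE_1)
```

Under these assumptions the sketch gives means of .428 and -.435, standard deviations of about .7628 and .7608, and r close to .281 with a probable error near .014, in agreement with the values printed above.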

The question now arises whether the difference in divergence in the two hands 
for the samples taken is significant. I have tested this by the method proposed by 
Professor Karl Pearson *. 

* “On the Probability that Two Independent Distributions of Frequency are really Samples from the
same Population,” by Karl Pearson, F.R.S., Biometrika, Vol. VIII, pp. 250-254, July, 1911.






































TABLE 2.
Divergence of Types in Right and Left Hands.

              |            Number of Types            |
              |    1  |    2  |    3  |    4  |    5  | Totals
--------------+-------+-------+-------+-------+-------+--------
Right Hand    |  118  |  842  |  839  |  194  |    7  |  2000
Left Hand     |  174  |  955  |  716  |  151  |    4  |  2000

For this Table

    χ² = 33.72,

whence P is less than .000,005.

That is, the odds are more than 200,000 to 1 against the occurrence of two such 
divergent samples if they were random samples of the same population. In other 
words the right hand generally tends to have a greater divergence of types than 
the left. 
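As an illustration of the test applied to Table 2, the following sketch computes a two-sample χ² of the form Σ N N′ (f/N - f′/N′)² / (f + f′). It is offered under stated assumptions only: the function names are hypothetical, the statistic is assumed to be the one intended by the method cited, and the tail probability uses the closed form for four degrees of freedom (five cells), which is an assumption about how P was obtained.

```python
from math import exp

RIGHT_HAND = [118, 842, 839, 194, 7]   # Table 2, right hand
LEFT_HAND = [174, 955, 716, 151, 4]    # Table 2, left hand


def two_sample_chi2(sample_a, sample_b):
    """Chi-square for two frequency distributions over the same classes."""
    n_a, n_b = sum(sample_a), sum(sample_b)
    total = 0.0
    for f_a, f_b in zip(sample_a, sample_b):
        if f_a + f_b:  # skip classes that are empty in both samples
            total += n_a * n_b * (f_a / n_a - f_b / n_b) ** 2 / (f_a + f_b)
    return total


def chi2_tail_4df(x):
    """Upper-tail probability of chi-square with 4 degrees of freedom."""
    return exp(-x / 2) * (1 + x / 2)


chi2 = two_sample_chi2(RIGHT_HAND, LEFT_HAND)
p_value = chi2_tail_4df(chi2)
```

With the frequencies of Table 2 this sketch gives χ² of about 33.75, close to but not identical with the printed 33.72, and a tail probability below .000,005, so the conclusion drawn in the text is unaffected.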

The following Table gives the distribution of classes of prints for the various 
fingers of both hands: 
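TABLE 3.
Distribution of Classes of Prints.

                       | Arches | Ulnar Loops | Radial Loops | Whorls | Composites
-----------------------+--------+-------------+--------------+--------+-----------
R₁                     |    46  |     1104    |       1      |   649  |    200
R₂                     |   352  |      537    |     456      |   481  |    174
R₃                     |   212  |     1399    |      38      |   274  |     77
R₄                     |    63  |     1015    |      17      |   729  |    176
R₅                     |    31  |     1631    |       3      |   228  |    107
Totals                 |   704  |     5686    |     515      |  2361  |    734
-----------------------+--------+-------------+--------------+--------+-----------
L₁                     |    91  |     1311    |       3      |   341  |    254
L₂                     |   313  |      732    |     383      |   437  |    135
L₃                     |   215  |     1408    |      35      |   240  |    102
L₄                     |    66  |     1283    |      12      |   491  |    148
L₅                     |    35  |     1727    |       -      |   150  |     88
Totals                 |   720  |     6461    |     433      |  1659  |    727
-----------------------+--------+-------------+--------------+--------+-----------
Totals for both hands  |  1424  |    12147    |     948      |  4020  |   1461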





The most striking feature of this Table is the uneven distribution of the 
various classes, especially the large proportion of ulnar loops and the very small 










number of radial loops except in the forefingers. A comparison of the distribution 
in the two hands shows considerable differences; e.g., in the left thumb the number
of arches is about double the number in the right; again, the whorls in each
finger of the right hand are greatly in excess of those on the left, while the left 
hand has, in every case, an excess of ulnar loops. 


If we arrange the numbers of each class in order of magnitude, we see that the 
order for the arches is identical for the two hands and also for the ulnar loops. 
In each of the other classes there is one exception to the “identical” order. 


I have tested these distributions for each type by the method referred to in the
footnote of p. 422A, with the following results: in the arches the odds are more
than 500 to 1 against the occurrence of two such divergent samples which are 
random samples taken from the same population; in the ulnar loops the odds are 
more than 200,000 to 1; in the radial loops about 5 to 2; in the whorls more than 
1,000,000 to 1, and in the composites more than 1300 to 1. 


We may thus fairly conclude that with the exception of the radial loops the 
frequency distribution of the classes between the fingers is different in the two 
hands and the radial loops are so few, except in the forefinger, as to be almost 
negligible. 


6. Subdivision of Loops. The great preponderance in the number of loops 
and the insignificance of the number of radial loops, except in the forefinger, make 
another subdivision of this class necessary. The method adopted is as follows :— 
All loops, in common with whorls and composites, contain certain well-defined 
points; these are (1) the “delta,” or “outer terminus,” and (2) the “point of the
core,” or “inner terminus.” [See Henry, pp. 22-24.] The number of ridges
intervening between the delta of a loop and the point of the core may be anything
from one up to about thirty; in only 38 cases out of the 13,095 loops does the
number of ridges exceed 25; two of these are over 30, one being 32 and the other
35. The complete distribution of ridges is given in Table 4a.

In dividing the loops into two sub-classes according to the number of ridges 
the nearest approach to equality is obtained by taking (a) those containing from 
1 to 12 ridges, and (b) those containing 13 or more ridges. For brevity I have 
called these classes (a) Small Loops, and (b) Large Loops; the terms “Small” and 
“Large” have no reference to the relative sizes of the patterns. The numbers in 
the two groups, thus arranged, are 7033 and 6062 respectively. 
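The rule of subdivision just described can be written as a one-line test on the ridge count. The sketch below is illustrative only; the function and constant names are hypothetical and are not taken from the paper.

```python
SMALL_LOOP_MAX_RIDGES = 12  # boundary stated in the text: 1 to 12 small, 13 or more large


def classify_loop(ridge_count: int) -> str:
    """Return 'SL' (Small Loop) or 'LL' (Large Loop) for a given ridge count."""
    if ridge_count < 1:
        raise ValueError("a loop must have at least one intervening ridge")
    return "SL" if ridge_count <= SMALL_LOOP_MAX_RIDGES else "LL"


# The two group sizes quoted in the text account for all 13,095 loops.
SMALL_LOOPS, LARGE_LOOPS = 7033, 6062
total_loops = SMALL_LOOPS + LARGE_LOOPS
```

As the text notes, the labels refer to the ridge count alone and not to the area of the pattern.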

Table 4b gives (1) the number of loops for each finger, (2) the means, (3) the 
standard deviations, and (4) the coefficients of variation in the numbers of ridges. 


Examining the Table below, consider first the means. The order, which is
identical in the two hands, runs:

(1) Thumb, (2) Ring Finger, (3) Little Finger, (4) Middle Finger, (5) Index. 

It will be noticed that this order of the means is quite different from that 
of the relative areas of the patterns. 












TABLE 4a.
Distribution of Ridges in Loops.

        |            Right Hand             |             Left Hand
Ridges  |   R₁ |   R₂ |   R₃ |   R₄ |   R₅ |   L₁ |   L₂ |   L₃ |   L₄ |   L₅
--------+------+------+------+------+------+------+------+------+------+------
    1   |    3 |   19 |   18 |    ? |    ? |    ? |    ? |    ? |    ? |   10
    2   |    7 |   68 |   47 |   27 |   24 |    ? |   79 |   65 |   25 |   32
    3   |    ? |   76 |   50 |   26 |   49 |    ? |    ? |   54 |   40 |   42
    4   |    8 |   71 |   66 |   35 |   68 |   22 |   71 |   60 |   33 |   56
    5   |   24 |   67 |   57 |   58 |   70 |   32 |   70 |   52 |   45 |   69
    6   |   14 |   47 |   80 |   36 |   92 |   29 |   54 |   47 |   32 |   69
    7   |   24 |   45 |   76 |   44 |   77 |   41 |   64 |   75 |   47 |   70
    8   |   29 |   50 |   85 |   45 |   73 |   54 |   67 |   70 |   34 |   91
    9   |   25 |   40 |   97 |   36 |   83 |   65 |   65 |   77 |   53 |   85
   10   |   43 |   47 |  109 |   55 |   94 |   72 |   73 |  117 |   61 |  122
   11   |   53 |   55 |  116 |   48 |  100 |   91 |   76 |  124 |   89 |  156
   12   |   53 |   69 |  120 |   71 |  133 |  110 |   77 |  125 |  119 |  157
   13   |   65 |   62 |  116 |   66 |  111 |   98 |   65 |  141 |   94 |  134
   14   |   89 |   60 |  128 |   76 |  143 |  119 |   67 |  139 |  111 |  150
   15   |   68 |   60 |   84 |   79 |  103 |  100 |   49 |   99 |   94 |  131
   16   |   91 |   43 |   75 |   73 |  106 |  128 |   35 |   66 |  117 |  136
   17   |   86 |   38 |   62 |   60 |   97 |   92 |   26 |   47 |   86 |   88
   18   |   84 |   33 |   28 |    ? |   73 |   70 |   12 |   37 |   60 |   64
   19   |  100 |   16 |   11 |   34 |   46 |    ? |   11 |   15 |   58 |   33
   20   |   61 |   12 |    5 |   34 |   48 |   39 |    7 |    6 |   33 |   17
   21   |    ? |    7 |    4 |   18 |   19 |    ? |    8 |    4 |   22 |    6
   22   |   34 |    2 |    2 |   19 |    3 |   13 |    6 |    1 |    ? |    4
   23   |    ? |    4 |    - |    8 |    4 |    ? |    - |    - |    ? |    3
   24   |   22 |    1 |    1 |    4 |    6 |    ? |    ? |    ? |    8 |    -
   25   |    6 |    - |    - |    2 |    2 |    1 |    - |    - |    2 |    2
   26   |    7 |    - |    - |    - |    - |    1 |    - |    ? |    ? |    -
   27   |    7 |    - |    - |    ? |    ? |    ? |    ? |    ? |    ? |    -
   28   |    7 |    - |    - |    - |    - |    - |    - |    - |    1 |    -
   29   |    1 |    - |    ? |    ? |    - |    - |    - |    - |    - |    -
   30   |    - |    1 |    - |    1 |    - |    - |    - |    - |    2 |    -
   32   |    - |    - |    - |    1 |    - |    - |    - |    - |    - |    -
   35   |    - |    - |    - |    - |    - |    - |    - |    - |    1 |    -
--------+------+------+------+------+------+------+------+------+------+------
Totals  | 1105 |  993 | 1437 | 1032 | 1634 | 1314 | 1115 | 1443 | 1295 | 1727









TABLE 4b.

               | Number of Loops |           Means           |   Standard Deviations   |  Coefficients of Variation
               |    R   |    L   |      R      |      L      |     R      |     L      |      R       |      L
---------------+--------+--------+-------------+-------------+------------+------------+--------------+-------------
Thumb          |  1105  |  1314  | 15.52 ± .10 | 13.27 ± .09 | 5.17 ± .07 | 4.63 ± .06 | 34.34 ± .53  | 34.85 ± .51
Index          |   993  |  1115  |  9.69 ± .12 |  8.83 ± .10 | 5.41 ± .08 | 4.88 ± .07 | 55.82 ± 1.08 | 55.24 ± 1.00
Middle Finger  |  1437  |  1443  | 10.41 ± .08 | 10.55 ± .08 | 4.46 ± .06 | 4.53 ± .06 | 42.80 ± .63  | 42.91 ± .63
Ring Finger    |  1032  |  1295  | 12.37 ± .12 | 12.77 ± .10 | 5.48 ± .08 | 5.09 ± .07 | 44.31 ± .78  | 39.85 ± .61
Little Finger  |  1634  |  1727  | 11.75 ± .08 | 11.53 ± .07 | 4.97 ± .06 | 4.46 ± .05 | 42.30 ± .58  | 38.71 ± .50









Comparing the two hands we see that the differences in the middle, ring and 
little fingers are insignificant ; in the thumb and index, however, there is a marked 
difference in favour of the right hand. 


The order of the standard deviations in the right hand is: 
(1) Ring Finger, (2) Index, (3) Thumb, (4) Little Finger, (5) Middle Finger. 


In the left hand the order of the last two is reversed, but the difference is 
small. 


With the exception of the middle finger, where the difference between the two 
hands is only about equal to the probable error and is therefore insignificant, the 
standard deviation is in every case greater for the right hand than for the left ; 
the differences are all of the same order of magnitude and range from about
.39 to .54.

Coming now to the coefficients of variation—the order in the right hand is: 

(1) Index, (2) Ring Finger, (3) Middle Finger, (4) Little Finger, (5) Thumb. 

In the left hand the order of the ring and middle fingers is interchanged. 


Comparing the two hands we see that in three cases—the thumb, index, and 
middle finger—the differences are each less than the probable errors ; in the other 
two cases the variability is considerably greater in the right hand than in the left. 


I have carefully revised the calculations involved but have been unable to 
detect any error; neither can I suggest a reason for the large differences. 
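For a reader wishing to repeat the check, the coefficient of variation is simply 100 times the standard deviation divided by the mean. The sketch below uses hypothetical function names; the probable error of a mean is assumed to be .6745 σ/√n, a formula the text does not state.

```python
from math import sqrt


def coefficient_of_variation(mean: float, sd: float) -> float:
    """Coefficient of variation, expressed as a percentage of the mean."""
    return 100.0 * sd / mean


def probable_error_of_mean(sd: float, n: int) -> float:
    """Assumed formula for the probable error of a mean: .6745 * sd / sqrt(n)."""
    return 0.6745 * sd / sqrt(n)


# Right index finger: mean 9.69, standard deviation 5.41, 993 loops.
cv_right_index = coefficient_of_variation(9.69, 5.41)
pe_right_index = probable_error_of_mean(5.41, 993)

# Right ring finger: mean 12.37, standard deviation 5.48, 1032 loops.
cv_right_ring = coefficient_of_variation(12.37, 5.48)
```

From the rounded means and standard deviations this gives about 55.83 for the right index finger and about 44.30 for the right ring finger, within a unit in the last place of the tabulated 55.82 and 44.31.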

In “A First Study of the Variability and Correlation of the Hand” (see p. 421), 
the writers find that the variability of bone lengths is closely related to the 
relative utility of the fingers, the least variability being that of the most useful 
finger. There appears, however, to be no such simple relationship between the 
ridges of the loops and the relative utility of the fingers. 

I have compared the distribution of ridges in the loops of the thumbs by
Professor Pearson’s method (p. 422A, footnote), which gives χ² = 166.64; hence the
odds are much greater than 1,000,000 to 1 against the occurrence of two such 
divergent samples if they were random samples taken from the same population. 


The distribution, absolute and percentage, of the five groups is now as
follows (Table 5).

In comparing the large and small loops it will be seen that in both hands there 
is an excess of large loops in the thumb, ring and little fingers, and an excess 
of small loops in the index and middle fingers. The order of these classes agrees 
in the two hands with one exception in each case. 


An approximate measure of the relationship existing between the various 
combinations of digits is given by the number of cases in which two particular 
digits on the same or on opposite hands have the same pattern. Table 6a gives 
the percentages for the same hand and for digits of the same name on opposite 














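TABLE 5.

                       |    Arches     |  Small Loops  |  Large Loops  |    Whorls     |  Composites
                       |  No.  |   %   |  No.  |   %   |  No.  |   %   |  No.  |   %   |  No.  |   %
-----------------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------
R₁                     |   46  |  2.30 |  290  | 14.50 |  815  | 40.75 |  649  | 32.45 |  200  | 10.00
R₂                     |  352  | 17.60 |  654  | 32.70 |  339  | 16.95 |  481  | 24.05 |  174  |  8.70
R₃                     |  212  | 10.60 |  921  | 46.05 |  516  | 25.80 |  274  | 13.70 |   77  |  3.85
R₄                     |   63  |  3.15 |  489  | 24.45 |  543  | 27.15 |  729  | 36.45 |  176  |  8.80
R₅                     |   31  |  1.55 |  870  | 43.50 |  764  | 38.20 |  228  | 11.40 |  107  |  5.35
Totals                 |  704  |       | 3224  |       | 2977  |       | 2361  |       |  734  |
-----------------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------
L₁                     |   91  |  4.55 |  547  | 27.35 |  767  | 38.35 |  341  | 17.05 |  254  | 12.70
L₂                     |  313  | 15.65 |  833  | 41.65 |  282  | 14.10 |  437  | 21.85 |  135  |  6.75
L₃                     |  215  | 10.75 |  887  | 44.35 |  556  | 27.80 |  240  | 12.00 |  102  |  5.10
L₄                     |   66  |  3.30 |  583  | 29.15 |  712  | 35.60 |  491  | 24.55 |  148  |  7.40
L₅                     |   35  |  1.75 |  959  | 47.95 |  768  | 38.40 |  150  |  7.50 |   88  |  4.40
Totals                 |  720  |       | 3809  |       | 3085  |       | 1659  |       |  727  |
-----------------------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------
Totals for both hands  | 1424  |       | 7033  |       | 6062  |       | 4020  |       | 1461  |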


hands; the readings for other combinations of digits on opposite hands are given 
in Table 6d, p. 431, where all the patterns are grouped in three classes for the 
sake of comparison with Galton’s results. 


Remarks on Table 6a. (a) The percentages vary greatly with different 
combinations and with different patterns. 


(b) The means and totals for digits of the same name on opposite hands are
all much greater than the corresponding readings for the right or for the left 
hand; the means, with one exception, and also the totals for particular combina- 
tions on the left hand are all greater than the corresponding readings for the 
right. 

(c) The order of magnitude of the totals is nearly the same for the two hands, 
those of the combinations including the thumb being, with one exception in each 
hand, the lowest. Hence, judging the relationship by the totals, it appears that 
(1) digits of the same name on opposite hands are the most closely related, the 
magnitude falling in order from the little fingers to the thumbs; (2) omitting
the thumbs, two consecutive digits are generally more closely related than others 
more widely separated ; (3) the digits of the left hand are more closely related 
than those of the right. 


(d) The relationship between the thumb and any other digit seems to be less 
close than that between any pair of digits not including the thumb; also, in both 
















hands, the thumb appears to be most closely related to the ring finger, then to 
the little finger, next to the middle and least to the fore-finger. 


Another method of investigating the approximate relationship between the 
various digits is by means of a “centesimal” scale, as in Galton’s Finger Prints, 
Ch. VIII. Table 6b gives such scale readings for small loops, large loops and
whorls, for pairs of digits on the same hand and for digits of the same name on 
opposite hands. I have not considered it necessary to include other couplets 


TABLE 6a.

Percentage of Cases in which various pairs of Digits possess the
same Class of Pattern.

                               |              Right Hand                |               Left Hand
Couplet                        |   A  |  SL  |  LL  |   W  |   C  | Totals |   A  |  SL  |  LL  |   W  |   C  | Totals
-------------------------------+------+------+------+------+------+--------+------+------+------+------+------+-------
Thumb and fore-finger          |  1.5 |  6.3 |  7.5 | 13.0 |  1.4 |  29.7  |  2.4 | 15.7 |  7.2 |  8.3 |  1.3 |  34.9
Thumb and middle finger        |  1.3 |  8.7 | 11.2 |  8.0 |   .7 |  29.9  |  1.6 | 16.2 | 13.4 |  4.9 |  1.4 |  37.5
Thumb and ring finger          |   .7 |  6.5 | 12.4 | 17.8 |  1.1 |  38.5  |   .7 | 13.2 | 17.1 |  7.9 |  1.5 |  40.4
Thumb and little finger        |   .4 |  9.9 | 15.0 |  6.9 |   .9 |  33.1  |   .7 | 19.0 | 15.7 |  2.7 |   .9 |  39.0
Fore-finger and middle finger  |  6.4 | 22.7 |  7.2 |  8.9 |  1.2 |  46.4  |  5.8 | 26.3 |  7.9 |  8.6 |  1.2 |  49.8
Fore-finger and ring finger    |  2.3 | 13.7 |  6.4 | 16.3 |   .9 |  39.6  |  2.5 | 17.8 |  7.1 | 12.7 |  1.5 |  41.6
Fore-finger and little finger  |  1.2 | 20.1 |  9.1 |  6.8 |   .9 |  38.1  |  1.3 | 26.3 |  8.2 |  4.3 |   .3 |  40.4
Middle and ring finger         |  2.5 | 16.8 |  9.1 | 12.0 |   .7 |  41.1  |  2.6 | 21.1 | 15.4 |  8.8 |   .8 |  48.7
Middle and little finger       |  1.0 | 26.8 | 14.0 |  4.6 |   .4 |  46.8  |  1.2 | 28.9 | 15.3 |  2.9 |   .6 |  48.9
Ring and little finger         |   .8 | 20.0 | 15.7 | 10.0 |  1.0 |  47.5  |   .9 | 23.7 | 19.5 |  6.0 |   .8 |  50.9
-------------------------------+------+------+------+------+------+--------+------+------+------+------+------+-------
Means                          |  1.8 | 15.2 | 10.8 | 10.4 |   .9 |  39.1  |  2.0 | 20.8 | 12.7 |  6.7 |  1.0 |  43.2

Couplet                        |   A  |  SL  |  LL  |   W  |   C  | Totals
-------------------------------+------+------+------+------+------+-------
Two thumbs                     |  1.5 | 10.2 | 23.4 | 13.5 |  2.7 |  51.3
Two fore-fingers               |  9.3 | 22.7 |  5.6 | 14.4 |  1.4 |  53.4
Two middle fingers             |  5.8 | 31.8 | 14.8 |  7.0 |   .9 |  60.3
Two ring fingers               |  1.9 | 18.2 | 18.4 | 21.2 |  1.5 |  61.2
Two little fingers             |   .9 | 36.1 | 27.2 |  5.0 |  1.4 |  70.6
-------------------------------+------+------+------+------+------+-------
Means                          |  3.9 | 23.8 | 17.9 | 12.2 |  1.6 |  59.3

from opposite hands, because, as is shown later, the relationship between any 


pair of digits from opposite hands is practically the same as between the corre- 
sponding pair on the same hand. I have also omitted arches and composites from 
this part of the inquiry as the numbers belonging to these classes are, as a rule, 
comparatively small. 



The scale reading for any pair of digits is calculated as follows :— 


Take, for example, the whorls on the right thumb and right fore-finger; the
former has 32.5 and the latter 24 per cent. of whorls, while 13 per cent. of right
hands have whorls on both thumb and fore-finger. Now from independent probability
we shall expect $\frac{32.5 \times 24}{100 \times 100} \times 100$, or 7.8 per cent.
of “double whorls” in this combination of digits and we therefore conclude that the
remaining 5.2 per cent. of double whorls are due to a relationship between the digits.
If we set aside the 7.8 per cent. out of the 32.5 and 24, we see that from the remaining
24.7 and 16.2 per cent., the greatest possible percentage of double whorls would
be 16.2; but as the actual percentage in addition to the 7.8 is 5.2, the centesimal
measure of the relationship is $\frac{5.2 \times 100}{16.2}$, or 32, to the nearest unit.
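The calculation just described may be expressed as a short routine. This is a sketch following the worked example only; the function name `centesimal_measure` and its argument names are hypothetical.

```python
def centesimal_measure(pct_a: float, pct_b: float, pct_both: float) -> float:
    """Centesimal measure of relationship between two digits for one pattern.

    pct_a, pct_b: percentage of hands showing the pattern on each digit.
    pct_both:     percentage of hands showing it on both digits together.
    """
    expected = pct_a * pct_b / 100.0              # joint percentage under independence
    excess = pct_both - expected                  # observed surplus over independence
    max_excess = min(pct_a, pct_b) - expected     # greatest possible surplus
    return 100.0 * excess / max_excess


# Worked example from the text: whorls on right thumb and right fore-finger.
example = centesimal_measure(32.5, 24.0, 13.0)

# Whorls on right thumb and right middle finger, using the unrounded
# percentages of Table 5 (32.45 and 13.70) and the 8.0 of Table 6a.
thumb_middle = centesimal_measure(32.45, 13.70, 8.0)
```

Rounded to the nearest unit the two calls give 32 and 38 respectively, which are the whorl entries for these couplets in the right hand.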


TABLE 6b.
Approximate Measures of Relationship between various pairs of Digits
on a Centesimal Scale.

                             |        Right Hand         |         Left Hand
Couplet                      |  SL  |  LL  |   W  | Means |  SL  |  LL  |   W  | Means
-----------------------------+------+------+------+-------+------+------+------+------
Thumb and fore-finger        |  16  |   6  |  32  |  18   |  27  |  21  |  34  |  27
Thumb and middle finger      |  26  |   0  |  38  |  21   |  26  |  16  |  28  |  23
Thumb and ring finger        |  27  |   8  |  29  |  21   |  27  |  16  |  29  |  24
Thumb and little finger      |  44  |   0  |  42  |  29   |  41  |   4  |  23  |  23
Fore and middle fingers      |  44  |  22  |  54  |  40   |  34  |  39  |  64  |  46
Fore and ring fingers        |  35  |  15  |  49  |  33   |  33  |  23  |  44  |  33
Fore and little fingers      |  32  |  25  |  47  |  35   |  29  |  32  |  46  |  36
Middle and ring fingers      |  42  |  11  |  80  |  44   |  50  |  31  |  64  |  48
Middle and little fingers    |  29  |  26  |  31  |  29   |  33  |  27  |  30  |  30
Ring and little fingers      |  67  |  32  |  81  |  60   |  64  |  26  |  74  |  55

Couplet                      |  SL  |  LL  |   W  | Means
-----------------------------+------+------+------+------
Two thumbs                   |  59  |  34  |  69  |  54
Two fore-fingers             |  48  |  27  |  55  |  43
Two middle fingers           |  47  |  41  |  52  |  47
Two ring fingers             |  64  |  50  |  78  |  64
Two little fingers           |  67  |  53  |  62  |  61





Most of the remarks on Table 6a will be found applicable to Table 6b, with, 
at most, but slight modification; the chief differences are that the relationship 
between the middle and little fingers is not so high in Table 6b as in Table 6a, 
and the order for pairs of like digits is not the same in the two Tables. 
















Comparison of results with those of Galton. In order to compare with Galton’s
results it is necessary to put large and small loops into one class and to include 
composites with whorls. Making some allowance for the difference of classification, 
and for any slight variation which may be due to the fact that the material is 
drawn from very different classes of the population, it will be found that there 
is almost perfect agreement between our data on all essential points. 


The relative frequency found in the two investigations was:

                Galton             Waite
    Arches       6.5 per cent.      7.1 per cent.
    Loops       67.5 per cent.     65.5 per cent.
    Whorls      26.0 per cent.     27.4 per cent.

The differences are small in comparison with some found by Galton when
examining the finger-prints of different races. For example, 1332 Hebrew
children had arches on the right fore-finger in 13.6 per cent. of the cases, while
only 7.9 per cent. of 250 English children had arches on that finger.
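The percentages given for the present material follow directly from the totals for both hands in Table 3, once radial and ulnar loops are merged and composites are counted with whorls. The sketch below, with hypothetical names, shows the arithmetic.

```python
# Totals for both hands, from Table 3 (20,000 digits in all).
TOTALS_BOTH_HANDS = {
    "arches": 1424,
    "ulnar_loops": 12147,
    "radial_loops": 948,
    "whorls": 4020,
    "composites": 1461,
}


def three_class_percentages(totals):
    """Percentages of arches, loops and whorls on Galton's three-class basis."""
    n = sum(totals.values())
    loops = totals["ulnar_loops"] + totals["radial_loops"]
    whorls = totals["whorls"] + totals["composites"]  # composites counted with whorls
    return {
        "arches": 100.0 * totals["arches"] / n,
        "loops": 100.0 * loops / n,
        "whorls": 100.0 * whorls / n,
    }


percentages = three_class_percentages(TOTALS_BOTH_HANDS)
```

To one decimal place the three values agree with the 7.1, 65.5 and 27.4 per cent. quoted for this investigation.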


TABLE 6c.
Percentage Frequency of Arches, Loops and Whorls on the Different Digits.

               |                 GALTON*                 |                        WAITE
               | From observations of the 5000 digits    | From observations of 20000 digits
               | of 500 persons                          | of 2000 persons
               |   Arches   |    Loops    |    Whorls    |    Arches     |     Loops       |    Whorls
Digit          |   R  |  L  |   R  |   L  |   R  |   L   |   R   |   L   |    R   |    L   |    R   |   L
---------------+------+-----+------+------+------+-------+-------+-------+--------+--------+--------+-------
Fore-finger    |  17  |  17 |   53 |   53 |   30 |   28  |  17.6 |  15.7 |   49.7 |   55.7 |   32.7 |  28.6
Middle finger  |   7  |   8 |   78 |   76 |   15 |   16  |  10.6 |  10.7 |   71.9 |   72.2 |   17.5 |  17.1
Little finger  |   1  |   2 |   86 |   90 |   13 |    8  |   1.5 |   1.7 |   81.7 |   86.4 |   16.8 |  11.9
Thumb          |   3  |   5 |   53 |   65 |   44 |   30  |   2.3 |   4.5 |   55.2 |   65.7 |   42.5 |  29.8
Ring finger    |   2  |   3 |   53 |   66 |   45 |   31  |   3.2 |   3.3 |   51.6 |   64.7 |   45.2 |  32.0
---------------+------+-----+------+------+------+-------+-------+-------+--------+--------+--------+-------
Totals         |  30  |  35 |  323 |  350 |  147 |  113  |  35.2 |  35.9 |  310.1 |  344.7 |  154.7 | 119.4








Galton arranged the digits as in Table 6c, in order to bring out certain 
peculiarities. He says :— 


“The digits are seen to fall into two well-marked groups ; the one including the fore, middle, 
and little fingers, the other including the thumb and ring finger. As regards the first group, the 
frequency with which any pattern occurs in any named digit is statistically the same, whether 


* From Finger Prints, p. 116, Table II. 






























that digit be on the right or on the left hand; as regards the second group, the frequency differs
greatly in the two hands. But though in the first group the two fore-fingers, the two middle,
and the two little fingers of the right hand are severally circumstanced alike in the frequency
with which their various patterns occur, the difference between the frequency of the patterns
on a fore, a middle, and a little finger, respectively, is very great.

“In the second group, though the thumbs on opposite hands do not resemble each other in
the statistical frequency of the A. L. W. patterns, nor do the ring fingers, there is a great
resemblance between the respective frequencies in the thumbs and ring fingers; for instance,
the whorls on either of these fingers on the left hand are only two-thirds as common as those
on the right. The figures in each line and in each column are consistent throughout in
expressing these curious differences, which must therefore be accepted as facts, and not as
statistical accidents, whatever may be their explanation.” (Galton, Finger Prints, p. 116.)


These remarks apply with equal force to my figures although the actual 
percentages differ somewhat in certain cases, the most marked being in the
middle finger arches and the little finger whorls. 


The following points of agreement in the distribution of the patterns are also 
noticed by reference to Table 6 c. 


The frequency of arches on the fore-fingers is much greater than on any other
of the four digits. “It amounts to 17 per cent. on the fore-fingers, while on the
thumbs and on the remaining fingers the frequency diminishes in a ratio that
roughly accords with the distance of each digit from the fore-finger.

“The frequency of Loops has two maxima; the principal one is on the little
finger, the secondary on the middle finger.

“Whorls are most common on the thumb and the ring-finger, most rare on the
middle and little fingers.” (Finger Prints, p. 117.)

In discussing radial and ulnar loops, which Galton describes as loops having
“inner” and “outer” slopes, respectively, he says:

“In all digits except the fore-fingers, the inner slope is much the more rare of
the two; but in the fore-fingers the inner slope appears two-thirds as frequently
as the outer slope. Out of the percentage of 53 loops of the one or other kind on 
the right fore-finger, 21 of them have an inner and 32 an outer slope; out of the 
percentage of 55 loops on the left fore-finger, 21 have inner and 34 have outer 
slopes. These subdivisions 21-21 and 32-34 corroborate the strong statistical 
similarity that was observed to exist between the frequency of the several patterns 
on the right and left fore-fingers ; a condition which was also found to characterise 
the middle and little fingers.” (Finger Prints, p. 118.) 


These statements are true, in general, of my Table 3, but my percentages on
the right fore-finger are 22.8 radial and 26.9 ulnar; on the left they are 19.2 and
36.6 respectively.

Close agreement is also observed in Table 6d which shows the tendency of 
digits to resemble one another in their various combinations. Galton omits 
combinations into which the little finger enters “because the overwhelming 




























frequency of loops in the little fingers would make the results of comparatively 
little interest, while their insertion would greatly increase the size of the table.” 
(Finger Prints, p. 119.) I have included them, however, for the sake of comparison
and completeness.


My percentages are readily obtained from Tables LVI to C in the Appendix. 
TABLE 6d.

Percentage of Cases in which the same Class of Pattern occurs in various Couplets of Digits.

                          |                    GALTON*                     |                     WAITE
                          |   Arches in   |   Loops in    |   Whorls in    |   Arches in   |   Loops in    |   Whorls in
                          | Same | Opp.   | Same | Opp.   | Same | Opp.    | Same | Opp.   | Same | Opp.   | Same | Opp.
Couplet                   | hand | hand   | hand | hand   | hand | hand    | hand | hand   | hand | hand   | hand | hand
--------------------------+------+--------+------+--------+------+---------+------+--------+------+--------+------+------
Two thumbs                |   -  |    2   |   -  |    ?   |   -  |    ?    |   -  |   1.5  |   -  |    ?   |   -  | 24.5
Two fore-fingers          |   -  |    9   |   -  |   38   |   -  |   20    |   -  |   9.3  |   -  |  36.2  |   -  | 20.4
Two middle fingers        |   -  |    3   |   -  |   65   |   -  |    9    |   -  |   5.8  |   -  |  60.6  |   -  |   ?
Two ring fingers          |   -  |    ?   |   -  |   46   |   -  |   26    |   -  |   1.9  |   -  |    ?   |   -  | 27.9
--------------------------+------+--------+------+--------+------+---------+------+--------+------+--------+------+------
Thumb and fore-finger     |   2  |    2   |  35  |   33   |  16  |   15    | 1.90 |  1.85  | 36.8 |  35.7  | 18.2 | 17.5
Thumb and mid-finger      |   1  |    1   |   ?  |    ?   |   9  |    8    |   ?  |    ?   | 47.0 |  46.7  | 10.9 | 10.5
Thumb and ring finger     |   1  |    1   |  40  |   38   |  20  |   18    |   ?  |    ?   | 41.0 |  39.4  | 20.8 | 19.0
Fore and mid-finger       |   5  |    5   |  48  |   46   |  12  |   11    |  6.1 |    ?   | 44.3 |  43.5  | 12.8 | 12.3
Fore and ring finger      |   2  |    2   |  35  |   35   |  17  |   17    |  2.4 |   2.3  | 36.5 |  35.7  | 20.8 | 20.2
Middle and ring finger    |   2  |    2   |  50  |   50   |  13  |   12    |  2.5 |   2.4  | 48.3 |  47.1  | 14.7 | 13.7
Thumb and little finger   |   -  |    -   |   -  |    -   |   -  |    -    |  .52 |   .45  | 54.2 |  53.6  |  8.8 |  8.1
Fore and little finger    |   -  |    -   |   -  |    -   |   -  |    -    | 1.20 |  1.15  | 47.7 |  47.2  |  9.3 |  8.1
Middle and little finger  |   -  |    -   |   -  |    -   |   -  |    -    |  1.1 |   1.0  | 63.9 |  63.5  |  6.5 |   ?
Ring and little finger    |   -  |    -   |   -  |    -   |   -  |    -    |   ?  |    ?   | 56.1 |  54.8  | 12.9 | 11.8





In commenting on his results in Table 6d, Galton says: “The agreement
in the above entries is so curiously close as to have excited grave suspicion that
it was due to some absurd blunder, by which the same figures were made
inadvertently to do duty twice over, but subsequent checking disclosed no error.
Though the unanimity of the results is wonderful, they are fairly arrived at, and
leave no doubt that the relationship of any one particular digit, whether thumb,
fore, middle, ring or little finger, to any other particular digit, is the same, whether
the two digits are on the same or on opposite hands.”

It will be noticed, however, that while exactly half of Galton’s eighteen pairs
of percentages, which are worked to the nearest unit only, are in strict agreement,
in all the other cases the result is one or two units less for two digits on opposite
hands than for the corresponding digits on the same hand. In my figures the
percentage for two digits on opposite hands is in every case the lower, and
although the differences are small, ranging only up to 1.8 while four-fifths of
them are less than 1, the consistency of the results suggests a slightly closer
relationship between a pair of digits on the same hand than between the
corresponding pair on opposite hands. This view is further supported at a later stage
of this paper. (See Remark (d) on Tables 14-16, p. 450.)

* Finger Prints, p. 120, Tables VIa and VIb.


One further comparison is of interest, namely, the measure of relationship 
between the various digits on a centesimal scale. It should be noted, however, 
that while Galton’s means are based on loops and whorls only, omitting arches 
from his three groups, mine are based on small loops, large loops and whorls, 
omitting arches and composites from my five groups; also Galton gives no results 
for those combinations which include the little finger. 


TABLE 6e.

Approximate Measures of Relationship between the various Digits,
on a Centesimal Scale.

                                 | GALTON* |     WAITE
Couplet                          |  Means  | Right | Left
Thumb and fore-finger            |   24    |  18   |  27
Thumb and middle finger          |   27    |  21   |  23
Thumb and ring finger            |   39    |  21   |  24
Fore and middle finger           |   60    |  40   |  46
Fore and ring finger             |   23    |  33   |  33
Middle and ring finger           |   52    |  44   |  48
Right and left thumbs            |   61    |      54
Right and left fore-fingers      |   48    |      43
Right and left middle fingers    |   43    |      47
Right and left ring fingers      |   65    |      64





For the reasons given above we could hardly expect that these readings would 
be even approximately equal, but for all that, the same general relations are seen 
to hold good in the two sets of results. 


It is convenient at this stage to summarize a few of the most important 
points which have been brought to light in the foregoing pages. These are: 


(a) A greater divergence of types in the right hand than in the left. 


(b) A clustering of the same type in the hands of an individual. 


(c) The uneven distribution of the various types in the different fingers, 
especially the almost entire absence of ulnar loops except in the index. 





* Finger Prints, p. 129, Table VIII. 






















(d) The differentiation of types in the two hands, in particular the large excess 
of whorls in the right hand and of arches in the left thumb. 


(e) Where there is any significant difference in the means, standard deviations 
and coefficients of variation in the numbers of ridges in the loops of the two hands 
those quantities are always greater for the right hand than for the left. 


(f) The relationship between digits of the same name on opposite hands is 
closer than that between any others; the digits of the left hand are more closely 
related than those of the right; and two consecutive digits, whether on the same 
or on opposite hands, are generally more closely related than others which are more 
widely separated. The relationship between the thumb and any other digit is less 
close than that of any pair not including the thumb. 


We may thus conclude that the left hand in its distribution of patterns is 
differentiated from the right and that individual fingers are associated in a differ- 
ential way with special types. We know that the right hand is differentiated from 
the left in use, and it would seem reasonable to suppose, even if we cannot account 
for the adaptation to use, that the finger-prints have been differentiated in accord- 
ance with this use differentiation. 


It may be suggested that the finger-prints if differentiated in accordance with 
diversity of use of the several fingers and of each hand follow a law of differ- 
entiated utility and not as the bones a law of maximum general utility of the 
finger. 


7. Correlation between the Classes of Finger-Prints. The object of this section 
is to obtain the associations between the various classes of prints and on the basis 
of these associations to enquire whether any Natural Order exists in which a 
certain degree of continuity may be assumed. For a complete investigation of this 
problem fifty-five Tables are necessary. They are: 


(a) 10 Tables of Classes for the Right Hand.
(b) 10 Tables of Classes for the Left Hand.
(c) 25 Tables of Classes for the Right against the Left Hand.
(d) 10 Tables of Classes for both Hands together.


These Tables are given in the Appendix, pp. 453 et seq. 
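The count of fifty-five Tables can be checked by a short enumeration. The sketch below assumes the five classes used later in Tables 10 to 13; the names `CLASSES`, `within` and `across` are illustrative only and do not come from the paper.

```python
from itertools import combinations, product

# The five classes of print used in this paper (see Tables 10 to 13).
CLASSES = ["Arches", "Small Loops", "Large Loops", "Whorls", "Composites"]

# Within one hand, or with both hands pooled, each unordered pair of
# different classes gives one Table.
within = list(combinations(CLASSES, 2))

# Right against left: every class on the right hand is paired with every
# class on the left hand, the same class on both hands included.
across = list(product(CLASSES, CLASSES))

# Right hand + left hand + both hands together + right against left.
total = 3 * len(within) + len(across)
print(len(within), len(across), total)
```

Ten pairs for each of the three pooled arrangements and twenty-five for right against left give the fifty-five Tables of the Appendix.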


The correlation coefficients and the contingencies have been calculated for the
whole of these Tables. For all the restricted Tables, I to XX, and XLVI to LV,
and in certain of the remaining Tables where the results of the other two methods
are widely divergent, the correlation ratio has also been found. In these cases
I have obtained η in both directions, the values of η given in Tables 8 and 9 being
the square root of the product of the two η’s for each Table *.


* The arithmetic instead of the geometric mean might have been taken, and there would not have
been very marked differences. But the geometric mean has the advantage of a symmetrical value, i.e.

$$ \eta^2 = \frac{\sqrt{S\{n_x(\bar{y}_x - \bar{y})^2\}\; S\{n_y(\bar{x}_y - \bar{x})^2\}}}{N\sigma_x\sigma_y}, $$

which has certain analogies with a coefficient of correlation.










8a. Method of Calculating the Coefficients of Contingency in Restricted Tables.
It will be noticed that the Tables I to XX and XLVI to LV, given in the
Appendix, differ in general character from most correlation Tables since the whole
of the cells in the lower right-hand portion are necessarily empty. Consequently
the usual method of finding the independent probability numbers for the purpose
of calculating contingencies is not applicable to those Tables. The method which
has been employed was suggested to me by Professor Pearson. It is as follows:

Consider Table VI, Appendix, p. 454, which gives the distribution of small
loops and whorls for the right hand. Commencing with the 45 hands which
contain 5 small loops each, it will be seen that the independent probability
number is the same as the observed, since a hand which has five prints of one
class can have no other. In the next column the distribution of the 211 prints by
independent probability is not in the ratio of 861 to 497 since 45 of the 861 have
already been disposed of, but in the ratio of 816 to 497, that is, the numbers in
the two rows are 131.1 and 79.9. Again in the third column from the right,
containing 3 small loops, 45 and 131.1 of the first row are accounted for and 79.9 of
the second row; hence the independent distribution of the 306 in the third
column is in the proportion of 684.9 : 417.1 : 292; that is, the numbers are
150.3, 91.6, and 64.1; and so on.

It should be noted that the same independent probability numbers are obtained
if we commence at the bottom of the first column with the 50 hands each
containing 5 whorls and work horizontally instead of vertically.

The differences between these independent probability numbers and the 
observed numbers are then used to find the contingency in the same way as in 
the ordinary contingency Table. 
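The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the observed frequencies are those of Table VI as printed in Table 7, the function names `independent_numbers` and `contingency` are hypothetical, and the coefficient is taken in the usual mean square contingency form C = √(φ²/(1 + φ²)). The original work was done to one decimal place, so small rounding differences from the printed figures are to be expected.

```python
from math import sqrt

# Observed frequencies of Table VI (right hand), as printed in Table 7.
# Rows: number of whorls 0..5; columns: number of small loops 0..5.
# None marks a cell that cannot occur, a hand having only five digits.
OBSERVED = [
    [78, 144, 204, 211, 179, 45],
    [106, 153, 126, 80, 32, None],
    [130, 92, 55, 15, None, None],
    [125, 38, 7, None, None, None],
    [104, 26, None, None, None, None],
    [50, None, None, None, None, None],
]


def independent_numbers(observed):
    """Spread each column total over its possible rows, last column first."""
    size = len(observed)
    # Part of each row total that has not yet been distributed.
    remaining = [sum(v for v in row if v is not None) for row in observed]
    expected = [[None] * size for _ in range(size)]
    for col in reversed(range(size)):
        rows = [r for r in range(size) if observed[r][col] is not None]
        col_total = sum(observed[r][col] for r in rows)
        pool = sum(remaining[r] for r in rows)
        # Share the column total in proportion to what is left in each row.
        for r in rows:
            expected[r][col] = col_total * remaining[r] / pool
        for r in rows:
            remaining[r] -= expected[r][col]
    return expected


def contingency(observed, expected):
    """Return chi-squared and the mean square contingency coefficient C."""
    pairs = [(o, e)
             for row_o, row_e in zip(observed, expected)
             for o, e in zip(row_o, row_e) if o is not None]
    chi2 = sum((o - e) ** 2 / e for o, e in pairs)
    n = sum(o for o, _ in pairs)
    phi2 = chi2 / n
    return chi2, sqrt(phi2 / (1 + phi2))


EXPECTED = independent_numbers(OBSERVED)
CHI2, C = contingency(OBSERVED, EXPECTED)
print([round(EXPECTED[r][4], 1) for r in range(2)])  # column with 4 small loops
print([round(EXPECTED[r][3], 1) for r in range(3)])  # column with 3 small loops
print(round(CHI2, 1), round(C, 3))
```

Working from the most restricted column towards the least restricted one means that each column total is shared only among the rows that can still receive it, which is the essential difference from the ordinary product of marginal totals.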

No correction for the number of cells has been applied to the contingency
coefficients in this type of Table as we have, at present, no appreciation of what it
should be.


The complete contingency Table, worked as described, is given below. 


TABLE 7.
Contingency Table.

            |        |                  Small Loops, R.                     |
Whorls, R.  |        |    0    |    1   |    2   |    3   |    4   |   5    | Totals
0           | n      |    78   |  144   |  204   |  211   |  179   |   45   |   861
            | ν      |  200.6  | 167.4  | 166.6  | 150.3  | 131.1  |   45   |
            | ψ      | −122.6  | −23.4  |  37.4  |  60.7  |  47.9  |   –    |
            | ψ²/ν   |  74.93  |  3.27  |  8.40  | 24.51  | 17.50  |   –    |
1           | n      |   106   |  153   |  126   |   80   |   32   |   –    |   497
            | ν      |  122.2  | 101.9  | 101.4  |  91.6  |  79.9  |   –    |
            | ψ      |  −16.2  |  51.1  |  24.6  | −11.6  | −47.9  |   –    |
            | ψ²/ν   |   2.15  | 25.63  |  5.97  |  1.47  | 28.72  |   –    |
2           | n      |   130   |   92   |   55   |   15   |   –    |   –    |   292
            | ν      |   85.5  |  71.4  |  71.0  |  64.1  |   –    |   –    |
            | ψ      |   44.5  |  20.6  | −16.0  | −49.1  |   –    |   –    |
            | ψ²/ν   |  23.16  |  5.94  |  3.61  | 37.61  |   –    |   –    |
3           | n      |   125   |   38   |    7   |   –    |   –    |   –    |   170
            | ν      |   63.8  |  53.2  |  53.0  |   –    |   –    |   –    |
            | ψ      |   61.2  | −15.2  | −46.0  |   –    |   –    |   –    |
            | ψ²/ν   |  58.70  |  4.34  | 39.92  |   –    |   –    |   –    |
4           | n      |   104   |   26   |   –    |   –    |   –    |   –    |   130
            | ν      |   70.9  |  59.1  |   –    |   –    |   –    |   –    |
            | ψ      |   33.1  | −33.1  |   –    |   –    |   –    |   –    |
            | ψ²/ν   |  15.45  | 18.54  |   –    |   –    |   –    |   –    |
5           | n      |    50   |   –    |   –    |   –    |   –    |   –    |    50
            | ν      |    50   |   –    |   –    |   –    |   –    |   –    |
            | ψ      |    –    |   –    |   –    |   –    |   –    |   –    |
Totals      |        |   593   |  453   |  392   |  306   |  211   |   45   |  2000

$$ \chi^2 = S(\psi^2/\nu) = 399.82, \qquad \phi^2 = \chi^2/N = .19991, \qquad C = \sqrt{\frac{\phi^2}{1+\phi^2}} = .408. $$


Note on Calculation of Contingency Coefficients. It should be borne in mind
that in finding the independent probability numbers in all contingency tables as
well as in calculating the standard deviations, it is assumed that the distribution
of the marginal totals is in the same ratio as would be the case if the whole
population were taken; in other words, that if $n_s$ is the total of an array when a
sample $N$ is taken and $m_s$ the total of the corresponding array when the whole
population $M$ is taken, then it is assumed that

$$ n_s = m_s \frac{N}{M}. $$

Evidently the correct value of the independent probability number in the $(s, s')$
cell of an ordinary contingency table would be

$$ \frac{\left(m_s \frac{N}{M}\right)\left(m'_{s'} \frac{N}{M}\right)}{N} = m_s m'_{s'} \frac{N}{M^2}, $$

and the contributory contingency

$$ \frac{\left(n_{ss'} - m_s m'_{s'} \frac{N}{M^2}\right)^2}{m_s m'_{s'} \frac{N}{M^2}}. $$

Similarly the quantities $m_s \frac{N}{M}$, $m'_{s'} \frac{N}{M}$, etc. would be the correct marginal totals
to use in finding the independent probability numbers for the restricted contingency
tables and in obtaining the standard deviations, instead of the observed totals
$n_s$, $n'_{s'}$, etc.

However, as we do not generally know $M$, $m_s$, $m'_{s'}$, etc., we are obliged to use
the observed marginal totals as the nearest approximation we can get to the
correct values, although $n_s$ is not, in general, equal to $m_s \frac{N}{M}$. A similar assumption
is of course always made in the formulae for the probable errors of samples, where
the sample value is put ultimately for the population value.


8b. Correlation Ratio of Restricted Tables. It is obvious that the ordinary
method of calculating the correlation ratio also requires modification with Tables
of this type; for this method is based on the differences between the means of
the marginal totals and the means of the arrays. Now, in restricted Tables it
would be impossible for the means of about half the arrays to approximate to the
means of the marginal totals and it would be fallacious to base any conclusion on
the deviations of the observed means from impossible values.

A nearer approximation would be to take the pseudo η from the formula

$$ \eta_p^2 = \frac{S\{n_x(\bar{y}_x - {}_x\bar{y}_i)^2\}}{N\sigma_y^2}, $$

where ${}_x\bar{y}_i$ is the mean of an array of the independent probability numbers; but
the denominator of this formula must be modified in such a way that in a case
of perfect association, η = unity. The desired result is obtained if we put $\Sigma^2$
instead of $\sigma_y^2$, where

$$ \Sigma^2 = \frac{SS(y - {}_x\bar{y}_i)^2}{N}. $$

We may write

$$ \Sigma^2 = \frac{SS(y - \bar{y}_x + \bar{y}_x - {}_x\bar{y}_i)^2}{N}
 = \frac{SS(y - \bar{y}_x)^2}{N} + \frac{S\{n_x(\bar{y}_x - {}_x\bar{y}_i)^2\}}{N} + \frac{2\,SS(y - \bar{y}_x)(\bar{y}_x - {}_x\bar{y}_i)}{N}
 = \frac{S(n_x\sigma_{a_x}^2)}{N} + \frac{S\{n_x(\bar{y}_x - {}_x\bar{y}_i)^2\}}{N}, $$

since the third term vanishes; hence

$$ \eta^2 = \frac{S\{n_x(\bar{y}_x - {}_x\bar{y}_i)^2\}/N}{S(n_x\sigma_{a_x}^2)/N + S\{n_x(\bar{y}_x - {}_x\bar{y}_i)^2\}/N}. $$

But

$$ \frac{S(n_x\sigma_{a_x}^2)}{N\sigma_y^2} = 1 - \eta_c^2, $$

and

$$ \frac{S\{n_x(\bar{y}_x - {}_x\bar{y}_i)^2\}}{N\sigma_y^2} = \eta_p^2, $$

where $\eta_c$ is the crude η found by the ordinary method.

We have, therefore, the value of the correlation ratio of restricted Tables
given by

$$ \eta^2 = \frac{\eta_p^2}{1 - \eta_c^2 + \eta_p^2}, $$

or

$$ \eta = \frac{\eta_p}{\sqrt{1 - \eta_c^2 + \eta_p^2}}. $$


The correlation ratio has been found by the method described above for all the 
restricted Tables; it has also been determined by the ordinary method for a few 
of the other Tables, but no correction for number of arrays has been applied. 
The results, together with the coefficients of correlation and of contingency, 
are given in Tables 8 and 9. 
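As a numerical illustration of the relation η = η_p / √(1 − η_c² + η_p²) and of the geometric mean used to combine the two directions, the sketch below takes the crude and pseudo values for small loops and whorls, left hand (Table XVI), as they are read from Table 8. The function names `restricted_eta` and `combined_eta` are illustrative and not part of the paper.

```python
from math import sqrt


def restricted_eta(eta_crude, eta_pseudo):
    """Correlation ratio of a restricted Table from the crude and pseudo values."""
    return eta_pseudo / sqrt(1 - eta_crude ** 2 + eta_pseudo ** 2)


def combined_eta(eta_yx, eta_xy):
    """Geometric mean of the correlation ratios found in the two directions."""
    return sqrt(eta_yx * eta_xy)


# Small loops and whorls, left hand (Table XVI); inputs as read from Table 8.
eta_yx = restricted_eta(0.623, 0.364)
eta_xy = restricted_eta(0.570, 0.310)
eta = combined_eta(eta_yx, eta_xy)
print(round(eta_yx, 3), round(eta_xy, 3), round(eta, 3))
```

The three results agree to three places with the entries .422, .353 and .386 given for this Table in Table 8.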


Regression curves for all the restricted Tables are given on Plates (a—e). The 
continuous line is the independent probability curve and the broken line the 
curve of the observed means. It follows that the area between the curves, 
weighted, of course, with the marginal totals, gives a measure of the correlation 
ratio between the two characters. 


Each set of three figures for two particular characters, namely, those for the
right hand, left hand, and both hands respectively, will generally be found to
resemble each other closely. Irregularities occur chiefly with composites but this
is not surprising if we consider the nature of this class.


8c. Coefficients of Correlation of Restricted Tables. A glance at the diagrams
of means of the restricted Tables, Plates (a—e), shows that the regression is
generally non-linear; it is also evident that a sensible value of r is introduced by
the restriction*. Hence the value of r as found by the ordinary product-moment
method is (i) too small because of the skewness of regression and (ii) too large on
account of the restriction. These two contrary causes render the coefficient of
correlation of restricted Tables unreliable and therefore quite valueless; for even
if it sometimes agrees fairly closely with the correlation ratio and the contingency
coefficient, this agreement is probably due to the fact that the two sources of
error counterbalance each other.
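The size of the correlation introduced by the restriction alone can be illustrated with the figures of Table 7. The sketch below is an assumption-laden illustration, not the author's computation: it applies the ordinary product-moment formula both to the observed numbers of Table VI and to the independent probability numbers printed (to one decimal) in Table 7; `product_moment_r` is a hypothetical helper name.

```python
from math import sqrt

# Table VI (right hand) as printed in Table 7: rows = whorls 0..5,
# columns = small loops 0..5. Impossible cells are simply left out.
OBSERVED = [
    [78, 144, 204, 211, 179, 45],
    [106, 153, 126, 80, 32],
    [130, 92, 55, 15],
    [125, 38, 7],
    [104, 26],
    [50],
]
# Independent probability numbers of the same Table, from Table 7.
INDEPENDENT = [
    [200.6, 167.4, 166.6, 150.3, 131.1, 45.0],
    [122.2, 101.9, 101.4, 91.6, 79.9],
    [85.5, 71.4, 71.0, 64.1],
    [63.8, 53.2, 53.0],
    [70.9, 59.1],
    [50.0],
]


def product_moment_r(table):
    """Ordinary product-moment correlation of a grouped frequency table."""
    cells = [(x, y, f) for y, row in enumerate(table) for x, f in enumerate(row)]
    n = sum(f for _, _, f in cells)
    mean_x = sum(x * f for x, _, f in cells) / n
    mean_y = sum(y * f for _, y, f in cells) / n
    sxx = sum(f * (x - mean_x) ** 2 for x, _, f in cells)
    syy = sum(f * (y - mean_y) ** 2 for _, y, f in cells)
    sxy = sum(f * (x - mean_x) * (y - mean_y) for x, y, f in cells)
    return sxy / sqrt(sxx * syy)


r_observed = product_moment_r(OBSERVED)
r_independent = product_moment_r(INDEPENDENT)
print(round(r_observed, 3), round(r_independent, 3))
```

The observed value comes out close to the −.585 entered for this Table in Table 8, while the independent probability numbers, which by construction carry no association beyond the restriction, still give a markedly negative r.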


In the remaining Tables, for which the results are given in Table 9, the 
regression is frequently skew; for this reason and for those given above, I have 
rejected the values of the coefficient of correlation in the sequence and have based 
my conclusions on the contingency coefficients, confirmed in general by the corre- 
lation ratio. 


* For example, in small loops and large loops, left, the case in which the difference between r and C
is the greatest, the independent probability numbers have the correlation coefficient −.512 (instead of
the theoretical value zero), as compared with −.507 of the observed numbers. In the case of arches
and small loops, both hands, r for the independent probability numbers is −.148, as against +.147 of
Table 8.


[Plates (a—e): Regression Curves for Finger-Prints, Figs. 1 to 30. Continuous Lines = Independent Probability Curves; Broken Lines = Actual Curves. Each set of three figures gives one pair of classes for the right hand, the left hand and both hands:
Figs. 1-3 (Tables I, XI, XLVI): Arches and Small Loops.
Figs. 4-6 (Tables II, XII, XLVII): Arches and Large Loops.
Figs. 7-9 (Tables III, XIII, XLVIII): Arches and Whorls.
Figs. 10-12 (Tables IV, XIV, XLIX): Arches and Composites.
Figs. 13-15 (Tables V, XV, L): Small Loops and Large Loops.
Figs. 16-18 (Tables VI, XVI, LI): Small Loops and Whorls.
Figs. 19-21 (Tables VII, XVII, LII): Small Loops and Composites.
Figs. 22-24 (Tables VIII, XVIII, LIII): Large Loops and Whorls.
Figs. 25-27 (Tables IX, XIX, LIV): Large Loops and Composites.
Figs. 28-30 (Tables X, XX, LV): Whorls and Composites.]

TABLE 8.

(illeg. = figure illegible in this copy.)

Classes                        |   r    | yx η_c | yx η_p | yx η | xy η_c | xy η_p | xy η |  η   |  C   | Table in Appendix

Right Hand
Arches and Small Loops         |  .062  |  .192  |  .242  | .227 |  .228  |  .266  | .264 | .251 | .305 | I
Arches and Large Loops         | −.273  |  .273  |  .194  | .198 |  .298  |  .215  | .220 | .209 | .246 | II
Arches and Whorls              | −.317  |  .359  |  .283  | .290 |  .353  |  .286  | .292 | .291 | .335 | III
Arches and Composites          | −.146  |  .154  |  .139  | .139 |  .157  |  .140  | .140 | .140 | .154 | IV
Small Loops and Large Loops    | −.422  |  .438  |  .074  | .082 |  .430  |  .073  | .080 | .081 | .166 | V
Small Loops and Whorls         | −.585  |  .622  |  .313  | .371 |  .574  |  .315  | .385 | .378 | .408 | VI
Small Loops and Composites     | −.270  |  .273  |  .207  | .210 |  .273  |  .197  | .201 | .205 | .228 | VII
Large Loops and Whorls         | −.234  |  .239  |  .079  | .081 |  .305  |  .112  | .116 | .097 | .162 | VIII
Large Loops and Composites     | −.093  |  .138  |  .093  | .093 |  .104  |  .045  | .045 | .065 | .128 | IX
Whorls and Composites          | −.020  |  .162  |  .117  | .126 |  .032  |  .061  | .061 | .088 | .137 | X

Left Hand
Arches and Small Loops         |  .038  |  .197  |  .271  | .267 |  .249  |  .301  | .292 | .281 | .335 | XI
Arches and Large Loops         | −.333  |  .344  |  .253  | .261 |  .375  |  .293  | .302 | .280 | .309 | XII
Arches and Whorls              | −.255  |  .286  |  .232  | .235 |  .281  |  .232  | .235 | .235 | .274 | XIII
Arches and Composites          | −.154  |  .159  |  .144  | .145 |  .164  |  .147  | .148 | .146 | .162 | XIV
Small Loops and Large Loops    | −.507  |  .534  |  .095  | .112 |  .510  |  .023  | .025 | .055 | .120 | XV
Small Loops and Whorls         | −.565  |  .623  |  .364  | .422 |  .570  |  .310  | .353 | .386 | .419 | XVI
Small Loops and Composites     | −.365  |  .392  |  .298  | .309 |  .401  |  .281  | .293 | .301 | .311 | XVII
Large Loops and Whorls         | −.160  |  .193  |  .129  | .131 |  .260  |  .157  | .160 | .145 | .236 | XVIII
Large Loops and Composites     | −.080  |  .131  |  .063  | .063 |  .088  |  .007  | .007 | .021 | .103 | XIX
Whorls and Composites          |  .115  |  .208  |  .217  | .217 |  .173  |  .202  | .201 | .209 | .244 | XX

Both Hands
Arches and Small Loops         |  .147  |  .340  |  .387  | .381 |  .279  |  .358  | .350 | .365 | .440 | XLVI
Arches and Large Loops         | −.364  |  .375  |  .298  | .306 |  .482  |  .364  | .384 | .343 | .383 | XLVII
Arches and Whorls              | −.319  |  .409  |  .355  | .363 |  .397  |  .341  | .348 | .355 | .402 | XLVIII
Arches and Composites          | −.203  |  .227  |  .220  | .220 |  .226  |  .212  | .213 | .216 | .239 | XLIX
Small Loops and Large Loops    | −.471  |  .527  |  .162  | .187 |  .476  |  .059  | .071 | .115 | .234 | L
Small Loops and Whorls         | −.638  |  .707  |  .421  | .511 |  .670  |  .412  | .478 | .495 | .503 | LI
Small Loops and Composites     | −.382  |  .393  |  .331  | .339 |  .402  |  .333  | .341 | .340 | .365 | LII
Large Loops and Whorls         | −.147  |  .194  |  .178  | .178 |  .323  |  .228  | .234 | .204 | .333 | LIII
Large Loops and Composites     | −.020  |  .109  |  .085  | .085 |  .087  |  .066  | .066 | .075 | .181 | LIV
Whorls and Composites          |  .150  |  .280  |  .295  | .294 | illeg. |  .235  |illeg.| .260 | .320 | LV





Remarks on Table 8. A comparison of the Correlation Ratio with the Contingency
Coefficient of the Restricted Tables.


(a) The values of η and C are generally in very close agreement.
(b) The value obtained for η is, however, always less than that for C.


(c) In only three cases does the difference between η and C exceed 0.1. The
probable error of η ranges from .015 for the smallest values to .011 for the largest;
it will also be remembered that no corrections have been applied to η nor to C,
since we do not yet know what these corrections should be for restricted Tables.
We may assume, however, that, as with ordinary Tables, correction would modify
η less than it would diminish C, and the corrected values of η and C would thus,
in all probability, agree somewhat more closely than at present.
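The range of probable errors quoted in Remark (c) is consistent with the customary large-sample form 0.6745 (1 − η²)/√N. The paper does not state which formula was used, so the sketch below is an assumption: it takes that form, with N = 2000 (the total of Table 7), and `probable_error` is an illustrative name.

```python
from math import sqrt


def probable_error(value, n):
    """Assumed form 0.6745 * (1 - value^2) / sqrt(n) for r or eta."""
    return 0.6745 * (1 - value ** 2) / sqrt(n)


N = 2000  # total of Table 7

# A very small value, the largest eta of Table 8, and two r's of Table 9.
for value in (0.021, 0.495, 0.686, 0.741):
    print(value, round(probable_error(value, N), 3))
```

Under this assumption a coefficient near zero has a probable error of about .015 and one near .5 of about .011, which matches the range stated above, and the values for .686 and .741 agree with the ± .008 and ± .007 attached to them in Table 9.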






TABLE 9.
Right and Left Hands.

Classes                               |      r        |  C*  |  C†  |     η‡      | Table in Appendix
Arches R and Arches L                 | +.686 ± .008  | .664 | .688 |      –      | XXI
Arches R and Small Loops L            | +.160 ± .015  | .285 | .302 | .234 ± .014 | XXII
Arches R and Large Loops L            | −.297 ± .014  | .322 | .337 |      –      | XXIII
Arches R and Whorls L                 | −.257 ± .014  | .290 | .307 |      –      | XXIV
Arches R and Composites L             | −.140 ± .015  | .118 | .161 |      –      | XXV
Small Loops R and Arches L            | +.185 ± .015  | .309 | .325 | .283 ± .014 | XXVI
Small Loops R and Small Loops L       | +.711 ± .007  | .631 | .635 |      –      | XXVII
Small Loops R and Large Loops L       | −.378 ± .013  | .382 | .393 |      –      | XXVIII
Small Loops R and Whorls L            | −.494 ± .011  | .499 | .506 |      –      | XXIX
Small Loops R and Composites L        | −.290 ± .014  | .292 | .309 |      –      | XXX
Large Loops R and Arches L            | −.275 ± .014  | .297 | .314 |      –      | XXXI
Large Loops R and Small Loops L       | −.217 ± .014  | .262 | .282 |      –      | XXXII
Large Loops R and Large Loops L       | +.550 ± .011  | .519 | .525 |      –      | XXXIII
Large Loops R and Whorls L            | −.123 ± .015  | .210 | .235 | .159 ± .015 | XXXIV
Large Loops R and Composites L        | −.017 ± .015  | .000 | .089 |      –      | XXXV
Whorls R and Arches L                 | −.308 ± .014  | .337 | .351 |      –      | XXXVI
Whorls R and Small Loops L            | −.555 ± .010  | .534 | .540 |      –      | XXXVII
Whorls R and Large Loops L            | +.021 ± .015  | .283 | .301 | .170 ± .015 | XXXVIII
Whorls R and Whorls L                 | +.741 ± .007  | .670 | .672 |      –      | XXXIX
Whorls R and Composites L             | +.280 ± .014  | .296 | .313 |      –      | XL
Composites R and Arches L             | −.146 ± .015  | .115 | .159 |      –      | XLI
Composites R and Small Loops L        | −.188 ± .014  | .172 | .203 |      –      | XLII
Composites R and Large Loops L        | +.131 ± .015  | .125 | .166 |      –      | XLIII
Composites R and Whorls L             | +.059 ± .015  | .127 | .168 | .105 ± .015 | XLIV
Composites R and Composites L         | +.250 ± .014  | .367 | .379 |      –      | XLV

* Values of contingency coefficients corrected for number of cells.

† Values of contingency coefficients not corrected for number of cells, given for the sake of comparison
with other Tables.

‡ The value of η is in all cases $\sqrt{\eta_{yx}\,\eta_{xy}}$.


Further Remarks on Tables 8 and 9. The results given in these Tables show:

(a) A general agreement between the correlations for the same pair of classes
of prints whether obtained by different methods from the same Table (omitting
values of r in Table 8), or from different Tables, the principal exceptions being
those for which the correlation ratio has been calculated in Table 9.

(b) A wide range in the magnitude of the results for different pairs of prints.

(c) The association between any class of print in one hand and the same class
in the other is, in general, as might be expected, much higher than any other
association of these Tables. Omitting the composites the remaining four
contingency coefficients between the same class in different hands are, with one
exception, each greater than any others; the same may be said of the correlation
coefficients, the exception in each case being the correlation between whorls in
the right and small loops in the left hand, which is slightly greater than the
correlation between the large loops in the right and left. Even with the
composites the contingency for the two hands is greater than that for composites with
any other class found from any of the Tables, while the correlation coefficients have
five exceptions to this general rule.


(d) The contingency coefficients given in Table 8, where the two hands are
taken together, are, with two exceptions, greater than the corresponding coefficients
in other parts of Tables 8 and 9. The exceptions are (1) the contingency
coefficient .234 for small loops with large loops of Table 8 is slightly less than those
in Table 9; and (2) the coefficient .503 for small loops with whorls in Table 8 is
rather less than that for whorls (right) with small loops (left) of Table 9.


A further study of the above Tables shows that :— 


Large loops are closest to arches.
Arches are closest to whorls.
Whorls are closest to small loops.
Small loops are closest to whorls and then to arches.
Composites are closest to small loops and then to arches.

The suggestion thus arises that arches and whorls have the closest natural 
resemblance to intermediate sized loops, and also that the “natural order” of the 
classes of finger-prints is :— 

(1) Large Loops, (2) Arches, (3) Whorls, (4) Small Loops, (5) Composites. 

This is more clearly seen from the following arrangement of the contingency 
coefficients, 

TABLE 10.
Contingency Coefficients of Right Hand.

              | Large Loops | Arches | Whorls | Small Loops | Composites
Large Loops   |      1      |  .246  |  .162  |    .166*    |    .128
Arches        |    .246     |   1    |  .335  |    .305     |    .154*
Whorls        |    .162     |  .335  |   1    |    .408     |    .137
Small Loops   |    .166     |  .305  |  .408  |      1      |    .228
Composites    |    .128     |  .154  |  .137  |    .228     |      1


TABLE 11.
Contingency Coefficients of Left Hand.

              | Large Loops | Arches | Whorls | Small Loops | Composites
Large Loops   |      1      |  .309  |  .236  |    .120     |    .103
Arches        |    .309     |   1    |  .274  |    .335*    |    .162
Whorls        |    .236     |  .274  |   1    |    .419     |    .244
Small Loops   |    .120     |  .335  |  .419  |      1      |    .311
Composites    |    .103     |  .162  |  .244  |    .311     |      1


* Coefficients which do not agree with the proposed “natural order.”










TABLE 12a.
Contingency Coefficients of Right Hand with Left.

                              Right Hand.
Left Hand     | Large Loops | Arches | Whorls | Small Loops | Composites
Large Loops   |    .519     |  .322  |  .283  |    .382*    |    .125*
Arches        |    .297     |  .664  |  .337  |    .309     |    .115
Whorls        |    .210     |  .290  |  .670  |    .499     |    .127
Small Loops   |    .262*    |  .285  |  .534  |    .631     |    .172
Composites    |    .000     |  .118  |  .296* |    .292     |    .367

(Corrected for number of Cells.)


TABLE 12b.
Contingency Coefficients of Right Hand with Left.

                              Right Hand.
Left Hand     | Large Loops | Arches | Whorls | Small Loops | Composites
Large Loops   |    .525     |  .337  |  .301  |    .393*    |    .166*
Arches        |    .314     |  .688  |  .351  |    .325     |    .159
Whorls        |    .235     |  .307  |  .672  |    .506     |    .168
Small Loops   |    .282*    |  .302  |  .540  |    .635     |    .203
Composites    |    .089     |  .161  |  .313* |    .309     |    .379

(Not corrected for number of Cells.)


TABLE 13.
Contingency Coefficients of both Hands taken together.

              | Large Loops | Arches | Whorls | Small Loops | Composites
Large Loops   |      1      |  .383  |  .333  |    .234     |    .181
Arches        |    .383     |   1    |  .402  |    .440*    |    .239
Whorls        |    .333     |  .402  |   1    |    .503     |    .320
Small Loops   |    .234     |  .440  |  .503  |      1      |    .361
Composites    |    .181     |  .239  |  .320  |    .361     |      1


The contingency coefficients of the right hand with the left have been given
both corrected and uncorrected for the number of cells and both sets of results
point to the same conclusion.


* Coefficients which do not agree with the proposed ‘‘ natural order.” 



















The proposed “ natural order” of the types is supported by the above Tables, 
only eight coefficients out of the fifty-five not being in complete agreement. In 
four of these cases the difference is very small, most likely well within the probable 
errors, and they may therefore be regarded as insignificant. 
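One way to read the asterisks in Tables 10 to 13 is that, with the classes arranged in the proposed order, each coefficient should not exceed its neighbour lying one step nearer the diagonal, either in the same row or in the same column. The paper does not state the rule explicitly, so this criterion is an assumption; the sketch below applies it to the right-hand coefficients of Table 10, and the names `ORDER`, `TABLE_10` and `exceptions` are illustrative.

```python
ORDER = ["Large Loops", "Arches", "Whorls", "Small Loops", "Composites"]

# Upper triangle of Table 10 (right hand), indexed by position in ORDER.
TABLE_10 = {
    (0, 1): 0.246, (0, 2): 0.162, (0, 3): 0.166, (0, 4): 0.128,
    (1, 2): 0.335, (1, 3): 0.305, (1, 4): 0.154,
    (2, 3): 0.408, (2, 4): 0.137,
    (3, 4): 0.228,
}


def exceptions(coeff):
    """Pairs whose coefficient exceeds a neighbour one step nearer the diagonal."""
    flagged = []
    for (i, j), value in sorted(coeff.items()):
        nearer = []
        if j - 1 > i:
            nearer.append(coeff[(i, j - 1)])  # same row, one column nearer
        if i + 1 < j:
            nearer.append(coeff[(i + 1, j)])  # same column, one row nearer
        if any(value > other for other in nearer):
            flagged.append((ORDER[i], ORDER[j]))
    return flagged


print(exceptions(TABLE_10))
```

Under this reading the two pairs flagged for the right hand are large loops with small loops and arches with composites, the same two entries that carry an asterisk in Table 10.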


A similar arrangement of the correlation coefficients still further supports the
proposed order, though not quite so conclusively, probably on account of spurious
correlations.


9. Association between the various Fingers. In this section I have calculated 
the contingency coefficients only, the classes being arranged in the order found in 
Section 8, p. 445. 


It would, of course, be possible to obtain Tables with much finer grouping 
either by further subdivision of the loops or by making use of the “secondary 
classification” described by Galton or Henry (see footnote, p. 421). All such finer 
grouping would raise the contingency; the extra labour involved by the addition 
of some three or four rows and columns to each Table would, however, be so con- 
siderable that the question arises whether some allowance can be made for the 
coarser grouping employed. This can only be done if we may suppose a “natural 
order” of some kind with a frequency roughly approaching the normal. This 
gives a rough upper limit to the contingency and is the purport of the work in 
the earlier sections on “ natural order” and corrections. 


As an example of the effect which finer grouping has on contingency I have
found the contingency between the index fingers of the two hands by means of
a “seven by seven” Table, the radial and ulnar loops being separated, and also by
means of a “five by five” Table in which no distinction is drawn between the
radial and ulnar loops. The results in this case, not corrected for grouping, are
.653 and .626; when corrected for grouping these results become .704 and .698,
respectively. They are so nearly identical as to suggest that no very material
advantage would be gained by a further subdivision of classes.


On the assumption that there is a certain degree of continuity in the distri- 
bution I have corrected all the results for grouping as well as for the number of 
cells. The method employed for the former correction is fully described by 
Professor Pearson in Biometrika*. 


The following Tables give the contingency coefficients for each finger with 
each other finger. The two sets of coefficients are included, viz. those which are 
not corrected for grouping, that is, which are obtained without any assumption of 
a “natural order” and those which are so corrected, in order that the conclusions 
based on the latter may be compared with those based on the former. 


* “On the Measurement of the Influence of ‘Broad Categories’ on Correlation,” by Karl Pearson,
F.R.S., Biometrika, Vol. IX. pp. 116—139.

















TABLE 14a.
Contingency Coefficients of Right Hand.

        R1      R2          R3      R4      R5
R1      1       .429±.011   .455    .469    .473
R2      .429    1           .645    .576    .519
R3      .455    .645        1       .665    .565
R4      .469    .576        .665    1       .690
R5      .473    .519        .565    .690    1

(Corrected for Grouping.)
TABLE 14b.
Contingency Coefficients of Right Hand.

        R1      R2          R3      R4      R5
R1      1       .373±.012   .379    .400    .385
R2      .373    1           .561    .511    .441
R3      .379    .561        1       .568    .460
R4      .400    .511        .568    1       .576
R5      .385    .441        .460    .576    1

(Not corrected for Grouping.)
TABLE 15a.
Contingency Coefficients of Left Hand.

        L1      L2      L3      L4      L5
L1      1       .503    .465    .474    .508±.012
L2      .503    1       .675    .609    .539
L3      .465    .675    1       .724    .585
L4      .474    .609    .724    1       .711
L5      .508    .539    .585    .711    1

(Corrected for Grouping.)
TABLE 15b.
Contingency Coefficients of Left Hand.

        L1      L2      L3      L4      L5
L1      1       .435    .390    .401    .410±.014
L2      .435    1       .582    .529    .447
L3      .390    .582    1       .611    .471
L4      .401    .529    .611    1       .577
L5      .410    .447    .471    .577    1

(Not corrected for Grouping.)



















TABLE 16a.
Contingency Coefficients of Right Hand with Left.

        R1      R2                  R3      R4      R5
L1      .777    .441                .440    .446    .424
L2      .479    .698 [5×5 Table]    .640    .559    .521
                .704 [7×7 Table]
L3      .4?7    .608                .786    .669    .561
L4      .446    .587                .663    .814    .675
L5      .501    .515                .537    .648    .899

(Corrected for Grouping.)
TABLE 16b.
Contingency Coefficients of Right Hand with Left.

        R1      R2                  R3      R4      R5
L1      .649    .385                .368    .383    .347
L2      .419    .626 [5×5 Table]    .49?    .4??    ?
                .653 [7×7 Table]
L3      .356    .530                .656    .?72    .459
L4      .375    .514                .558    .702    .556
L5      .402    .439                .431    .534    .707

(Not corrected for Grouping.)


Remarks on Tables 14a, 15a, and 16a. (a) It will be seen from these 
Tables that the association of types between corresponding fingers of the two 
hands is, with one exception, always closer than that between any other pair of 
fingers. The order of magnitude of these associations is :— 

(1) Little Finger, (2) Ring Finger, (3) Middle Finger, (4) Thumb, (5) Index 
Finger. 

(b) If we omit the thumb for the present, leaving it for separate comment, 
and consider the association between corresponding fingers as of the “first order,” 
that between fingers of consecutive rank, such as R2 and R3, or R2 and L3, as of the
“second order,” and so on, we notice a significant relation between any particular 
association and its “order.” Thus: 

First order associations range from .899 to .704 or .698,
Second order associations range from .724 to .608,
Third order associations range from .609 to .537,
Fourth order associations range from .539 to .515.


The amount of overlapping in these ranges appears to be quite insignificant. 
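
The grouping by “order” can be checked mechanically. The following is a minimal sketch, assuming the corrected coefficients as read from Tables 14a, 15a and 16a (thumb omitted) and taking the order of a pair as the difference in finger rank plus one. The dictionary layout and function name are illustrative, the index pair uses the 7×7 value .704, and one or two entries of Table 16a are hard to read in the printed copy and are entered here as best made out.

```python
# Corrected coefficients for fingers 2-5 (thumb omitted).
# Same-hand tables: key is (rank, rank).
right = {(2, 3): .645, (2, 4): .576, (2, 5): .519,
         (3, 4): .665, (3, 5): .565, (4, 5): .690}
left = {(2, 3): .675, (2, 4): .609, (2, 5): .539,
        (3, 4): .724, (3, 5): .585, (4, 5): .711}
# Right hand with left: key is (left rank, right rank).
cross = {
    (2, 2): .704, (2, 3): .640, (2, 4): .559, (2, 5): .521,
    (3, 2): .608, (3, 3): .786, (3, 4): .669, (3, 5): .561,
    (4, 2): .587, (4, 3): .663, (4, 4): .814, (4, 5): .675,
    (5, 2): .515, (5, 3): .537, (5, 4): .648, (5, 5): .899,
}


def ranges_by_order(*tables):
    """Return {order: (largest, smallest)} with order = |rank difference| + 1."""
    grouped = {}
    for table in tables:
        for (i, j), coefficient in table.items():
            grouped.setdefault(abs(i - j) + 1, []).append(coefficient)
    return {order: (max(v), min(v)) for order, v in sorted(grouped.items())}


ranges = ranges_by_order(right, left, cross)
for order, (high, low) in ranges.items():
    print(order, high, low)
```

With these inputs the only overlap between adjacent orders is between the smallest second order value (.608) and the largest third order value (.609).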










(c) It follows from (b) that if in any of these Tables we start from a first order
association and pass in any direction through those of other orders we find a 
continuous and rapid fall; that is, a finger is always more closely related to a 
consecutive finger than to one more remote (but see (a)); and the greater the 
difference in rank between two fingers, whether on the same or on different hands, 
the less close is the association between them. 


(d) The association between any pair of fingers in one hand is, in general, 
closer than either of the corresponding associations between a finger of the right 
and one of the left hand. There is one exception to this rule in associations of the 
second order, one in the third and one in the fourth. 


(e) The associations of the left hand are in every case closer than the
corresponding associations of the right.
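
Remark (e) can be verified pair by pair. This is a sketch only, assuming the corrected values of Tables 14a and 15a as read from the printed copy; the thumb and index entry of the left hand is not perfectly legible and is taken here as .503, which is an assumption. The variable names are illustrative.

```python
# Corrected same-hand coefficients; key is (rank, rank), 1 = thumb.
right = {(1, 2): .429, (1, 3): .455, (1, 4): .469, (1, 5): .473,
         (2, 3): .645, (2, 4): .576, (2, 5): .519,
         (3, 4): .665, (3, 5): .565, (4, 5): .690}
left = {(1, 2): .503, (1, 3): .465, (1, 4): .474, (1, 5): .508,
        (2, 3): .675, (2, 4): .609, (2, 5): .539,
        (3, 4): .724, (3, 5): .585, (4, 5): .711}

# Pairs for which the left-hand association is NOT the closer one.
exceptions = [pair for pair in right if left[pair] <= right[pair]]

# Excess of left over right for each pair, rounded to three decimals.
margins = {pair: round(left[pair] - right[pair], 3) for pair in right}
```

No exceptions are found, although for the thumb the excess is in places as small as .005, well inside the probable error quoted in remark (f).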


(f) The associations of either thumb with any finger all fall below those of
the fourth order of (b), and the range of the sixteen coefficients is only from .424
to .508. As it is difficult to base any conclusions on these figures as to the
relations between the thumb and the various fingers, I have carefully checked 
them by reworking the whole of the calculations involved, but have in every case 
arrived at the same result. I have also found the probable error* for the largest 
and for one of the smallest coefficients of the set. As the contingency coefficients 
are all of the same order of magnitude and the number of individuals the same in 
all cases, the probable errors of all will be of about the same magnitude and it is 
unnecessary to calculate more. The probable errors in the two cases being of the 
order .011, the differences in the contingency coefficients may be regarded as
insignificant. Although in three cases out of the four the contingencies of the 
thumb with the middle, ring and little finger respectively are in ascending order 
of magnitude, the differences are so small in comparison with the probable errors 
that no conclusion can be drawn as to the relations between the thumb and the 
various fingers. We may notice, however, that the rule (d) holds good for the 
thumbs with but two exceptions. 
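
The comparison of differences with probable errors in (f) can be sketched as follows. Two assumptions are made which are not stated in the text: that the two coefficients compared are independent with equal probable errors, so that the probable error of their difference is sqrt(2) times .011, and that a difference is only treated as significant when it exceeds about three times that quantity, a conventional rule of thumb. Since all the coefficients come from the same individuals, the independence assumption is at best approximate.

```python
from itertools import combinations
from math import sqrt

PROBABLE_ERROR = 0.011  # order of magnitude quoted in remark (f)


def difference_in_probable_errors(c1, c2, pe=PROBABLE_ERROR):
    """|c1 - c2| in units of the probable error of the difference,
    assuming independent coefficients with equal probable errors."""
    return abs(c1 - c2) / (sqrt(2) * pe)


# Right thumb with each finger of the right hand (Table 14a, corrected).
right_thumb = {"index": .429, "middle": .455, "ring": .469, "little": .473}

ratios = {
    (a, b): difference_in_probable_errors(right_thumb[a], right_thumb[b])
    for a, b in combinations(right_thumb, 2)
}
largest = max(ratios.values())
```

On these assumptions every difference among the right thumb coefficients falls below three probable errors, whereas a difference such as that between .899 and .704 exceeds it many times over.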


The contingency coefficients given in Tables 14b, 15b, and 16b, are all smaller
than the corresponding results of the other series, but a careful study will show
that the remarks (a) to (f) almost invariably apply to these Tables also.


Note. In some preliminary work on this paper I classified the types as 
follows :—(1) Arches and loops with 1—3 ridges, (2) Loops with 4—10 ridges, 
(3) Loops with 11—14 ridges, (4) Loops with 15 or more ridges, (5) Whorls, 
(6) Composites. With this classification the following contingency coefficients 
were found for corresponding fingers of the two hands: Thumb .686, Forefinger
.642, Middle finger .686, Ring finger .730, Little finger .738. These results,
which were not corrected for grouping, are seen to agree very closely with those


* The method employed is that given in Biometrika, Vol. V. Parts I. and II., “On the Probable
Error of Mean-Square Contingency,” by John Blakeman and Karl Pearson.































of Table 16, the values being rather larger probably on account of the slightly 
finer grouping.

10. Comparison with Results of Previous Work. It would be well to compare 
briefly some of my results with those of the two works mentioned on p. 421. 

Whiteley and Pearson arrived at the following conclusions :— 

(i) The hand is a very highly correlated organ, far more highly correlated 
than the skull and even somewhat more so than the long bones. 

(ii) The parts of the left hand are distinctly more closely correlated than 
those of the right. 

(iii) The order of correlation of the first finger joints is identical for both 
hands. This order is as follows :— 

(a) The external fingers have the least correlation and the little finger always 
less than the index. 

(b) A finger has always more correlation with a second than with any other
finger from which it is separated by the second. 

(iv) With corresponding members on both sides the extreme pairs show least 
correlation, and the pair of middle fingers higher correlation than the pair of ring 
fingers. 

In the paper of Miss Lewenz and Miss Whiteley the chief results which are 
comparable with those for the finger-prints are the following :— 

(v) There is a slight, but we cannot say definitely significant, preponderance 
in the correlations of the right hand bones over those of the left. 

(vi) Dividing the hand into marginal members, i.e. thumb, index and little 
fingers, and central members, i.e. middle and ring fingers, and the bones into 
“lower bones,” i.e. distal and middle phalanges, and “upper bones,” i.e. metacarpal 
bones and proximal phalanges, the correlations roughly speaking are highest for 
the upper bones of the central members and become less as we move out from this 
upper centre towards the lower and marginal parts of the hand. This is true 
whether we take pairs in lateral or in longitudinal series. 

(vii) The highest correlations occur between corresponding bones of the right 
and left hands. 

(viii) Generally there is a “rule of neighbourhood,” i.e. any bone is more
closely correlated with a second of the same series than with any other from which 
it is separated by that second. 

The above conclusions are to a certain extent mutually corroborative: e.g. (vi)
and (iv) are in agreement, and (viii) agrees in substance with (iii b). Again (vii)
agrees with Table IV, p. 130, of the “First Study,” while (iii a) is in general
supported by Table XXII of the “Second Study.” On the other hand (ii) and (v) do
not agree. It should be noted, however, that the “First Study” was based on the
measurements of the first finger joint only of both hands of 551 women, while for 
the “Second Study,” in which all the finger bones were measured, only 37 to 44 













skeleton hands were available. The writers of the latter paper state that in
consequence of the comparatively small number of bones measured they look upon
that study “as one of suggestion rather than of definite statistical proof,” and it 
is possible that with more adequate data their results might have been somewhat 
modified and exceptions less numerous. 

There appears to be no direct connection between finger bones and the patterns 
of finger-prints, but it is distinctly interesting to find that some of the most 
striking relations discovered amongst the former also exist in the latter. In 
particular, my conclusion (a) agrees with (vii), (c) with (iii b) and with (viii), and 
(e) with (ii) but not with (v). 

11. Concluding Remarks. The most important conclusions reached in this 
paper have been summarized on p. 432, and in the Remarks on Tables 8 and 9, 
pp. 443—445, and on Tables 14—16, p. 449; it scarcely seems necessary to re- 
capitulate them, but a comparison will show an almost perfect agreement although 
the sets of results have been obtained by entirely different methods. 


The essential results of the present paper are that finger-prints are not
scattered at random over the fingers; certain types are more or less peculiar to certain
fingers, and further the appearance of one type is correlated with the appearance
of a second. In this respect certain fingers are more closely related to each other
than to any third finger, and the distribution of this relationship is in general 
similar to what is known of the like distribution of the correlations of the bones of 
the same fingers. 

It has been already stated that the material used is taken entirely from adult 
males of the lower type of the artisan and labouring classes; it would be of 
interest to compare the results obtained with those found from the finger-prints 
of females of the same grade of society, and also when the material is drawn 
from the professional classes. 


Tables I to XX, and XLVI to LV, are of a type which I have not previously 
met with; novel methods have accordingly been employed in calculating coefficients 
of contingency and correlation ratios from those Tables. The general investigation 
of Tables of this type offers an interesting problem, demanding further study. 


I am deeply grateful to Professor Pearson for placing at my disposal the 
necessary material together with a number of books and memoirs bearing on 
the subject, and for much valuable assistance given during the course of the 
investigation. 

It can scarcely be expected, with such a mass of numerical calculation involved, 
that the work should be entirely free from inaccuracies, but I trust that no serious 
errors have escaped detection. The laborious arithmetic has been much lightened 
by the use of a calculator, for the loan of which my thanks are due to the
Government Grant Committee of the Royal Society of London.


The Tables on which the preceding calculations are based are given in the 
Appendix, pp. 453—478, 
















APPENDIX. 


TABLE I.
Arches and Small Loops, Right.

                           Arches, R.
Small Loops, R.      0      1      2      3      4      5   Totals
       0           544     26     16      1      1      5      593
       1           343     65     16     14     15      -      453
       2           256     73     45     18      -      -      392
       3           185     87     34      -      -      -      306
       4           168     43      -      -      -      -      211
       5            45      -      -      -      -      -       45
    Totals        1541    294    111     33     16      5     2000
TABLE II.
Arches and Large Loops, Right.

                           Arches, R.
Large Loops, R.      0      1      2      3      4      5   Totals
       0           286     78     49     23     16      5      457
       1           489    114     46     10      0      -      659
       2           400     66     12      0      -      -      478
       3           245     31      4      -      -      -      280
       4           103      5      -      -      -      -      108
       5            18      -      -      -      -      -       18
    Totals        1541    294    111     33     16      5     2000
TABLE III.
Arches and Whorls, Right.

                           Arches, R.
Whorls, R.           0      1      2      3      4      5   Totals
       0           512    215     86     27     16      5      861
       1           406     61     24      6      0      -      497
       2           275     16      1      0      -      -      292
       3           168      2      0      -      -      -      170
       4           130      0      -      -      -      -      130
       5            50      -      -      -      -      -       50
    Totals        1541    294    111     33     16      5     2000
TABLE IV.
Arches and Composites, Right.

                           Arches, R.
Composites, R.       0      1      2      3      4      5   Totals
       0          1042    232     96     33     15      5     1423
       1           376     52     13      0      1      -      442
       2           107      9      2      0      -      -      118
       3            13      1      0      -      -      -       14
       4             3      0      -      -      -      -        3
       5             0      -      -      -      -      -        0
    Totals        1541    294    111     33     16      5     2000




































TABLE V.
Small Loops and Large Loops, Right.

                         Small Loops, R.
Large Loops, R.      0      1      2      3      4      5   Totals
       0           104     69     67     81     91     45      457
       1           144     99    142    154    120      -      659
       2           143    141    123     71      -      -      478
       3           119    101     60      -      -      -      280
       4            65     43      -      -      -      -      108
       5            18      -      -      -      -      -       18
    Totals         593    453    392    306    211     45     2000
TABLE VI.
Small Loops and Whorls, Right.

                         Small Loops, R.
Whorls, R.           0      1      2      3      4      5   Totals
       0            78    144    204    211    179     45      861
       1           106    153    126     80     32      -      497
       2           130     92     55     15      -      -      292
       3           125     38      7      -      -      -      170
       4           104     26      -      -      -      -      130
       5            50      -      -      -      -      -       50
    Totals         593    453    392    306    211     45     2000
TABLE VII.
Small Loops and Composites, Right.

                         Small Loops, R.
Composites, R.       0      1      2      3      4      5   Totals
       0           353    294    279    257    195     45     1423
       1           165    121     93     47     16      -      442
       2            62     34     20      2      -      -      118
       3            10      4      0      -      -      -       14
       4             3      0      -      -      -      -        3
       5             0      -      -      -      -      -        0
    Totals         593    453    392    306    211     45     2000
TABLE VIII.
Large Loops and Whorls, Right.

                         Large Loops, R.
Whorls, R.           0      1      2      3      4      5   Totals
       0           197    289    162    133     62     18      861
       1            86    130    148     87     46      -      497
       2            47     84    101     60      -      -      292
       3            27     76     67      -      -      -      170
       4            50     80      -      -      -      -      130
       5            50      -      -      -      -      -       50
    Totals         457    659    478    280    108     18     2000






























































































TABLE IX.
Large Loops and Composites, Right.

                         Large Loops, R.
Composites, R.       0      1      2      3      4      5   Totals
       0           317    471    318    205     94     18     1423
       1            88    153    130     57     14      -      442
       2            41     32     27     18      -      -      118
       3             9      2      3      -      -      -       14
       4             2      1      -      -      -      -        3
       5             0      -      -      -      -      -        0
    Totals         457    659    478    280    108     18     2000
TABLE X.
Whorls and Composites, Right.

                           Whorls, R.
Composites, R.       0      1      2      3      4      5   Totals
       0           648    331    180    108    106     50     1423
       1           160    126     84     48     24      -      442
       2            49     31     24     14      -      -      118
       3             3      7      4      -      -      -       14
       4             1      2      -      -      -      -        3
       5             0      -      -      -      -      -        0
    Totals         861    497    292    170    130     50     2000
TABLE XI.
Arches and Small Loops, Left.

                           Arches, L.
Small Loops, L.      0      1      2      3      4      5   Totals
       0           463     22      6      0      1      4      496
       1           328     44     23     11     19      -      425
       2           248     59     28     31      -      -      366
       3           189     81     45      -      -      -      315
       4           199     84      -      -      -      -      283
       5           115      -      -      -      -      -      115
    Totals        1542    290    102     42     20      4     2000
TABLE XII. 
Arches and Large Loops, Left. 
Arches, Z. 
o | 1 2 | 8 | 5 5 | Totals 






















TABLE XIII.
Arches and Whorls, Left.

                           Arches, L.
Whorls, L.           0      1      2      3      4      5   Totals
       0           766    233     89     41     19      4     1152
       1           350     44     12      1      1      -      408
       2           193      9      1      0      -      -      203
       3           120      4      0      -      -      -      124
       4            92      0      -      -      -      -       92
       5            21      -      -      -      -      -       21
    Totals        1542    290    102     42     20      4     2000
TABLE XIV.
Arches and Composites, Left.

                           Arches, L.
Composites, L.       0      1      2      3      4      5   Totals
       0          1048    237     87     40     20      4     1436
       1           375     42     15      2      0      -      434
       2            94      9      0      0      -      -      103
       3            20      2      0      -      -      -       22
       4             4      0      -      -      -      -        4
       5             1      -      -      -      -      -        1
    Totals        1542    290    102     42     20      4     2000
TABLE XV.
Small Loops and Large Loops, Left.

                         Small Loops, L.
Large Loops, L.      0      1      2      3      4      5   Totals
       0            82     48     63     83    124    115      515
       1           121     73     94    130    159      -      577
       2            97    116    108    102      -      -      423
       3           100    110    101      -      -      -      311
       4            63     78      -      -      -      -      141
       5            33      -      -      -      -      -       33
    Totals         496    425    366    315    283    115     2000
TABLE XVI.
Small Loops and Whorls, Left.

                         Small Loops, L.
Whorls, L.           0      1      2      3      4      5   Totals
       0            96    183    248    251    259    115     1152
       1           103    139     85     57     24      -      408
       2           105     63     28      7      -      -      203
       3            88     31      5      -      -      -      124
       4            83      9      -      -      -      -       92
       5            21      -      -      -      -      -       21
    Totals         496    425    366    315    283    115     2000
















TABLE XVII.
Small Loops and Composites, Left.

                         Small Loops, L.
Composites, L.       0      1      2      3      4      5   Totals
       0           234    292    264    264    267    115     1436
       1           176    110     84     48     16      -      434
       2            64     19     17      3      -      -      103
       3            17      4      1      -      -      -       22
       4             4      0      -      -      -      -        4
       5             1      -      -      -      -      -        1
    Totals         496    425    366    315    283    115     2000
TABLE XVIII.
Large Loops and Whorls, Left.

                         Large Loops, L.
Whorls, L.           0      1      2      3      4      5   Totals
       0           338    315    195    165    106     33     1152
       1            63     94    114    102     35      -      408
       2            22     58     79     44      -      -      203
       3             ?      ?     35      -      -      -      124
       4             ?      ?      -      -      -      -       92
       5            21      -      -      -      -      -       21
    Totals         515    577    423    311    141     33     2000
TABLE XIX.
Large Loops and Composites, Left.

                         Large Loops, L.
Composites, L.       0      1      2      3      4      5   Totals
       0           380    399    276    230    118     33     1436
       1           102    124    119     66     23      -      434
       2            23      ?      ?     15      -      -      103
       3             7      ?      ?      -      -      -       22
       4             2      2      -      -      -      -        4
       5             1      -      -      -      -      -        1
    Totals         515    577    423    311    141     33     2000
TABLE XX.
Whorls and Composites, Left.

                           Whorls, L.
Composites, L.       0      1      2      3      4      5   Totals
       0           924    249    120     66     56     21     1436
       1           173    120     59     46     36      -      434
       2            44     26     21     12      -      -      103
       3             8     11      3      -      -      -       22
       4             2      2      -      -      -      -        4
       5             1      -      -      -      -      -        1
    Totals        1152    408    203    124     92     21     2000







































TABLE XXI.
Arches, Right, and Arches, Left.

                           Arches, R.
Arches, L.           0      1      2      3      4      5   Totals
       0          1369    145     24      3      1      0     1542
       1           146     97     40      7      0      0      290
       2            24     45     26      5      2      0      102
       3             2      6     16     12      6      0       42
       4             0      1      5      6      5      3       20
       5             0      0      0      0      2      2        4
    Totals        1541    294    111     33     16      5     2000

TABLE XXII.
Arches, Right, and Small Loops, Left.

                           Arches, R.
Small Loops, L.      0      1      2      3      4      5   Totals
       0           473     14      ?      0      3      3      496
       1           342     47     17     10      6      3      425
       2           249     68     29     15      5      0      366
       3           209     74     28      2      2      0      315
       4           187     66     26      4      0      0      283
       5            81     25      6      2      1      0      115
    Totals        1541    294    111     33     16      5     2000

TABLE XXIII.
Arches, Right, and Large Loops, Left.

                           Arches, R.
Large Loops, L.      0      1      2      3      4      5   Totals
       0           304     95     68     28     15      5      515
       1           440    104     28      4      1      0      577
       2           363     50      9      1      0      0      423
       3           269     36      6      0      0      0      311
       4           133      8      0      0      0      0      141
       5            32      1      0      0      0      0       33
    Totals        1541    294    111     33     16      5     2000





TABLE XXIV.
Arches, Right, and Whorls, Left.

                           Arches, R.
Whorls, L.           0      1      2      3      4      5   Totals
       0           760    243     97     31     16      5     1152
       1           354     42     11      1      0      0      408
       2           193      8      2      0      0      0      203
       3           121      1      1      1      0      0      124
       4            92      0      0      0      0      0       92
       5            21      0      0      0      0      0       21
    Totals        1541    294    111     33     16      5     2000






















TABLE XXV.
Arches, Right, and Composites, Left.

                           Arches, R.
Composites, L.       0      1      2      3      4      5   Totals
       0          1048    244     95     28     16      5     1436
       1           373     44     13      4      0      0      434
       2            95      5      2      1      0      0      103
       3            20      1      1      0      0      0       22
       4             4      0      0      0      0      0        4
       5             1      0      0      0      0      0        1
    Totals        1541    294    111     33     16      5     2000
TABLE XXVI.
Small Loops, Right, and Arches, Left.

                         Small Loops, R.
Arches, L.           0      1      2      3      4      5   Totals
       0           558    369    279    177    129     30     1542
       1            25     49     74     74     57     11      290
       2             4     15     18     40     21      4      102
       3             1     13     14      ?      ?      0       42
       4             3      5      7      ?      ?      0       20
       5             2      2      0      0      0      0        4
    Totals         593    453    392    306    211     45     2000











TABLE XXVII.
Small Loops, Right, and Small Loops, Left.

                         Small Loops, R.
Small Loops, L.      0      1      2      3      4      5   Totals
       0           373     97     21      5      0      0      496
       1           148    149     88     35      4      1      425
       2            53    118    103     65     26      1      366
       3            14     56     98     86     55      6      315
       4             4     30     63     90     78     18      283
       5             1      3     19     25     48     19      115
    Totals         593    453    392    306    211     45     2000





TABLE XXVIII.
Small Loops, Right, and Large Loops, Left.

                         Small Loops, R.
Large Loops, L.      0      1      2      3      4      5   Totals
       0            92     76     89    109    115     34      515
       1           154    103    125    121     65      9      577
       2           137    117     90     52     25      2      423
       3           117    107     60     21      6      0      311
       4            70     42     27      2      0      0      141
       5            23      8      1      1      0      0       33
    Totals         593    453    392    306    211     45     2000




TABLE XXIX.
Small Loops, Right, and Whorls, Left.

                         Small Loops, R.
Whorls, L.           0      1      2      3      4      5   Totals
       0           159    217    286    249    198     43     1152
       1           130    143     79     43     12      1      408
       2           112     58     19     13      1      0      203
       3            92     25      6      1      0      0      124
       4            79     10      2      0      0      1       92
       5            21      0      0      0      0      0       21
    Totals         593    453    392    306    211     45     2000
TABLE XXX.
Small Loops, Right, and Composites, Left.

                         Small Loops, R.
Composites, L.       0      1      2      3      4      5   Totals
       0           323    311    299    267    191     45     1436
       1           194    109     82     32     17      0      434
       2            60     22     11      7      3      0      103
       3            12     10      0      0      0      0       22
       4             4      0      0      0      0      0        4
       5             0      1      0      0      0      0        1
    Totals         593    453    392    306    211     45     2000


TABLE XXXI.
Large Loops, Right, and Arches, Left.

                         Large Loops, R.
Arches, L.           0      1      2      3      4      5   Totals
       0           290    483    394    255    103     17     1542
       1            69    123     70     22      5      1      290
       2            51     36     12      3      0      0      102
       3            24     16      2      0      0      0       42
       4            19      1      0      0      0      0       20
       5             4      0      0      0      0      0        4
    Totals         457    659    478    280    108     18     2000
TABLE XXXII.
Large Loops, Right, and Small Loops, Left.

                         Large Loops, R.
Small Loops, L.      0      1      2      3      4      5   Totals
       0           112    134    139     73     32      6      496
       1            68    110    110     89     41      7      425
       2            76    114     91     52     29      4      366
       3            65    128     81     38      3      0      315
       4            90    121     45     23      3      1      283
       5            46     52     12      5      0      0      115
    Totals         457    659    478    280    108     18     2000






































TABLE XXXIII.
Large Loops, Right, and Large Loops, Left.

                         Large Loops, R.
Large Loops, L.      0      1      2      3      4      5   Totals
       0           264    178     51     19      3      0      515
       1           121    256    135     59      5      1      577
       2            53    135    149     67     18      1      423
       3            17     66    101     77     41      9      311
       4             2     20     36     47     32      4      141
       5             0      4      6     11      9      3       33
    Totals         457    659    478    280    108     18     2000
TABLE XXXIV.
Large Loops, Right, and Whorls, Left.

                         Large Loops, R.
Whorls, L.           0      1      2      3      4      5   Totals
       0           262    399    238    163     75     15     1152
       1            69    114    125     73     24      3      408
       2            35     66     66     31      5      0      203
       3            37     41     34      9      3      0      124
       4            40     37     11      3      1      0       92
       5            14      2      4      1      0      0       21
    Totals         457    659    478    280    108     18     2000
TABLE XXXV.
Large Loops, Right, and Composites, Left.

                         Large Loops, R.
Composites, L.       0      1      2      3      4      5   Totals
       0           331    478    338    195     81     13     1436
       1            90    140    107     70     23      4      434
       2            27     34     24     13      4      1      103
       3             6      5      9      2      0      0       22
       4             2      2      0      0      0      0        4
       5             1      0      0      0      0      0        1
    Totals         457    659    478    280    108     18     2000








TABLE XXXVI.
Whorls, Right, and Arches, Left.

                           Whorls, R.
Arches, L.           0      1      2      3      4      5   Totals
       0           526    405    262    169    130     50     1542
       1           193     69     27      1      0      0      290
       2            80     19      3      0      0      0      102
       3            38      4      0      0      0      0       42
       4            20      0      0      0      0      0       20
       5             4      0      0      0      0      0        4
    Totals         861    497    292    170    130     50     2000







TABLE XXXVII.
Whorls, Right, and Small Loops, Left.

                           Whorls, R.
Small Loops, L.      0      1      2      3      4      5   Totals
       0            59     90     97    108     94     48      496
       1           137    120     95     44     27      2      425
       2           170    126     50     12      8      0      366
       3           194     80     35      5      1      0      315
       4           202     65     15      1      0      0      283
       5            99     16      0      0      0      0      115
    Totals         861    497    292    170    130     50     2000
TABLE XXXVIII.
Whorls, Right, and Large Loops, Left.

                           Whorls, R.
Large Loops, L.      0      1      2      3      4      5   Totals
       0           311     83     36     29     28     28      515
       1           241    134     80     46     58     18      577
       2           126    128     87     49     32      1      423
       3           112     93     63     34      7      2      311
       4            58     47     21     11      ?      ?      141
       5            13     12      5      1      ?      ?       33
    Totals         861    497    292    170    130     50     2000
TABLE XXXIX.
Whorls, Right, and Whorls, Left.

                           Whorls, R.
Whorls, L.           0      1      2      3      4      5   Totals
       0           768    281     77     17      9      0     1152
       1            79    160    110     41     15      3      408
       2            11     42     71     45     32      2      203
       3             2     10     25     36     40     11      124
       4             1      4      7     27     32     21       92
       5             0      0      2      4      2     13       21
    Totals         861    497    292    170    130     50     2000
TABLE XL.
Whorls, Right, and Composites, Left.

                           Whorls, R.
Composites, L.       0      1      2      3      4      5   Totals
       0           744    331    185     93     55     28     1436
       1            98    132     78     58     52     16      434
       2            16     28     19     16     19      5      103
       3             3      5      7      2      4      1       22
       4             0      1      2      1      0      0        4
       5             0      0      1      0      0      0        1
    Totals         861    497    292    170    130     50     2000































TABLE XLI.
Composites, Right, and Arches, Left.

                         Composites, R.
Arches, L.           0      1      2      3      4      5   Totals
       0          1042    378    107     12      3      0     1542
       1           233     46      9      2      0      0      290
       2            84     16      2      0      0      0      102
       3            40      2      0      0      0      0       42
       4            20      0      0      0      0      0       20
       5             4      0      0      0      0      0        4
    Totals        1423    442    118     14      3      0     2000
TABLE XLII.
Composites, Right, and Small Loops, Left.

                         Composites, R.
Small Loops, L.      0      1      2      3      4      5   Totals
       0           299    135     52      8      2      0      496
       1           294    100     26      5      0      0      425
       2           262     82     20      1      1      0      366
       3           237     71      7      0      0      0      315
       4           233     38     12      0      0      0      283
       5            98     16      1      0      0      0      115
    Totals        1423    442    118     14      3      0     2000








TABLE XLIII.
Composites, Right, and Large Loops, Left.

                         Composites, R.
Large Loops, L.      0      1      2      3      4      5   Totals
       0           416     80     17      2      0      0      515
       1           416    121     36      3      1      0      577
       2           278    111     27      6      1      0      423
       3           210     79     20      1      1      0      311
       4            86     37     16      2      0      0      141
       5            17     14      2      0      0      0       33
    Totals        1423    442    118     14      3      0     2000
TABLE XLIV.
Composites, Right, and Whorls, Left.

                         Composites, R.
Whorls, L.           0      1      2      3      4      5   Totals
       0           878    219     50      3      2      0     1152
       1           245    116     40      6      1      0      408
       2           129     55     17      2      0      0      203
       3            89     28      5      2      0      0      124
       4            64     21      6      1      0      0       92
       5            18      3      0      0      0      0       21
    Totals        1423    442    118     14      3      0     2000
















TABLE XLV. 
Composites, Right, and Composites, Left. 


Composites, R. 










Totals 





Composites, L. 


CONWNOS 


1436 
434 
103 

22 











0 1 2 | 
0 | 1100! 276 | 53 
258 130 43 
2 57 | 98 | 14 
; 8 8 4 
4 0 0 3 
0 | 0 1 

Totals 1423 | 442 | 118 | 


TABLE XLVI. 





Arches and Small Loops, Both Hands. 









Totals| 





Small Loops. 


Arches. 
| 2 | oe ee a ee 

1 1 1 0 
7 6 3 0 
12 7 1 0 
13 9 8 6 
25 10 2 6 
25 18 10 10 
21 18 15 — 
23 21 — — 
18 wate) he a 





373 | 
245 
223 
226 











TABLE XLVII. 
Arches and Large Lvops, Both Hands. 








Totals| 





Large Loops. 


DO WNVA AS CONS 


my 








Arches. 

| 2 | 3 l | 5 
24 32 19 13 
34 23 ll 6 
33 | 18 | 4 3 
27 9 4 0 

| 16 5 1 0 
9 2 0 0 

2 1 | 1 — 

0 o;j— _ 

0 —_— oe id 








PeTTUPl Tee 
S41 title 


264 
299 
360 
306 
279 
193 
136 
95 
52 
13 
3 




















2000 


































TABLE XLVIII.
Arches and Whorls, Both Hands.

                                    Arches.
Whorls.      0     1     2     3     4     5     6     7     8     9    10   Totals
    0      351   160   100    66    30    21    16    12     5     5     2      768
    1      240    69    25    16     7     1     2     0     0     0     -      360
    2      191    38    11     5     2     0     1     0     0     -     -      248
    3      146    14     8     2     1     0     0     0     -     -     -      171
    4      124     6     1     1     0     0     0     -     -     -     -      132
    5       86     3     0     0     0     0     -     -     -     -     -       89
    6       77     1     0     0     0     -     -     -     -     -     -       78
    7       71     0     0     0     -     -     -     -     -     -     -       71
    8       47     0     0     -     -     -     -     -     -     -     -       47
    9       23     0     -     -     -     -     -     -     -     -     -       23
   10       13     -     -     -     -     -     -     -     -     -     -       13
 Totals   1369   291   145    90    40    22    19    12     5     5     2     2000
TABLE XLIX.
Arches and Composites, Both Hands.

                                    Arches.
Composites.  0     1     2     3     4     5     6     7     8     9    10   Totals
    0      650   193   101    62    33    18    18    11     5     5     2     1098
    1      406    66    34    19     6     3     1     1     0     0     -      536
    2      202    22     7     7     1     1     0     0     0     -     -      240
    3       74     7     3     1     0     0     0     0     -     -     -       85
    4       22     3     0     1     0     0     0     -     -     -     -       26
    5        7     0     0     0     0     0     -     -     -     -     -        7
    6        6     0     0     0     0     -     -     -     -     -     -        6
    7        1     0     0     0     -     -     -     -     -     -     -        1
    8        1     0     0     -     -     -     -     -     -     -     -        1
    9        0     0     -     -     -     -     -     -     -     -     -        0
   10        0     -     -     -     -     -     -     -     -     -     -        0
 Totals   1369   291   145    90    40    22    19    12     5     5     2     2000
TABLE L.
Small Loops and Large Loops, Both Hands.

                                  Small Loops.
Large Loops. 0     1     2     3     4     5     6     7     8     9    10   Totals
    0       44    13    10    18    17    20    31    31    31    30    19      264
    1       44    19    13    19    14    30    37    45    42    36     -      299
    2       60    24    24    30    35    42    54    55    36     -     -      360
    3        ?     ?     ?     ?     ?    47    32    34     -     -     -      306
    4       49    34    41    44    46    40    25     -     -     -     -      279
    5       35    39    38    33    29    19     -     -     -     -     -      193
    6        ?     ?     ?     ?     ?     -     -     -     -     -     -      136
    7       24    29    22    20     -     -     -     -     -     -     -       95
    8       16    20    16     -     -     -     -     -     -     -     -       52
    9        9     4     -     -     -     -     -     -     -     -     -       13
   10        3     -     -     -     -     -     -     -     -     -     -        3
 Totals    373   245   223   225   198   198   179   165   109    66    19     2000





































TABLE LI.
Small Loops and Whorls, Both Hands.

                                  Small Loops.
Whorls.      0     1     2     3     4     5     6     7     8     9    10   Totals
    0       23    34    52    68    81   105   111   126    87    62    19      768
    1       19    43    39    55    59    50    42    30    19     4     -      360
    2       29    46    44    49    27    26    17     7     3     -     -      248
    3       32    34    38    30    15    14     6     2     -     -     -      171
    4       46    27    25    16    12     3     3     -     -     -     -      132
    5       47    21    14     5     2     0     -     -     -     -     -       89
    6       52    17     6     1     2     -     -     -     -     -     -       78
    7       50    16     4     1     -     -     -     -     -     -     -       71
    8       39     7     1     -     -     -     -     -     -     -     -       47
    9       23     0     -     -     -     -     -     -     -     -     -       23
   10       13     -     -     -     -     -     -     -     -     -     -       13
 Totals    373   245   223   225   198   198   179   165   109    66    19     2000
TABLE LII.
Small Loops and Composites, Both Hands. 
Small Loops. 
0 1 2 3 4 5 6 z 8 9 10 {Totals 
0 114 92 | 113 |] 115 | 111 | 121 | 121 | 136 95 61 19 | 1098 | 
1 128 70 71 64 53 | 60 41 26 13 5 = 536 | 
2 78 46 23 37 21 | 15 16 3 l — — 240 | 
a 32 24 13 6 8 1 l 0 — —_ = 85 | 
1 5 i3-| 7 2 te. l Op af ar fe LS 26 | 
2 2 J 1 0 0 0 ~ — — — — 7 
| 6 5 l 0 0 0 — _ — = — 6 
Be 0 ] 0 0 as ~ — — — io l 
8 l 0 0 — — —_ ~ — -- — aes : 
9 0 0 =a . — - = at tes vias 0 | 
10 0 - ss ae amie z = fe 0 | 
| Totals] 373 | 245 | 223 | 225 {| 198 | 198 | 179 | 165 | 109 66 19 | 2000 | 
TABLE LIII. 
Large Loops and Whorls, Both Hands. 
Large Loops. 
0 1 2 l 5 6 7 8 9 10 Cotals 
| 0 156 | 138 | 133 91 73 51 43 12 29 9 3 768 
i 4s 24 5 | 54 | 63 | 56 | 39 | 30 | 29 | 16 4 - | 360 
ae 12 24 36 35 50 35 30 19 7 - — 248 
= 12 16 3 21 35 35 17 5 : ~ — 171 
4 8 7 20 32 26 23 16 - - 132 | 
5 3 8 20° | 20 28 10 . cat = — 89 
6 4 9 24 | 30 11 - — — — - 78 | 
7 10 18 29°} 14° — — -|— — 71 | 
8 Ll 22 14 | — |— — _ - | — — 47 | 
9 11 2} —|]—}]—-j]— —{—jy-|]—-|{-— 23 
10 13 = aU Came ea = at Poy a - 13 | 
Totals | 264 | 299 | 360 | 306 | 279 193 136 95 52 13 3 2000 











































Large Loops 




TABLE LIV. 


and Composites, Both Hands. 


Large Loops. 




















































| 4 o 6 “ote ae? 9 10 {Totals 
| i | | 
| | | 
0 160. 4 37 188 | 144 130 104 79 51 36 | 9 3 1098 
a 39 | 76 |100 | 96 | 94] 50 | 32 | 32 | 13 | 4 | — J 536 
a| 2 28 | 29 45 | 40 42 30 16 7/ 38] — _ 240 
3| 38 10 ll 19 | 17 8 8 4 #4 }—|—- 85 
=| 4 3 6 5 | 6 3 1 2);/—/—|—-—]}]— 26 
S 5 1 l eT 2 1 0 eae ee (oS pee = 7 
2, | ! } | 
q| 6 “SS ee l 1}/—}]/—}—]—|]—-|]-— 6 
° 7 l | o.4°- @ Oo | — -— — | — —— | oe os 1 
Se SP Bee” ae ee Pe eee: ey ee ee ee re Ey A. 1 
9 ie Ses Oe = ss = ae = ao = ae 0 
10 0 = = os ae ae aes pa ae ss = 0 
Totals | 264 299 360 306 279 193 | 136 95 52 13 3 2000 
TABLE LV. 
Whorls and Composites, Both Hands. 
Whorls. 
0 1 D 3 | x 6 y 8 9 10 {Totals 
0 555 =| 172 109 71 68 26 23 27 22 12 13 1098 | 
1 148 114 71 7 30 40 34 26 15 1] — 536 
> 2 42 52 4] 33 22 12 16 12 10 — —_— 240 
a L 2 l 0 
2 3 17 15 15 12 8 8 4 6 — — — 85 
a7 4 1 4 7 5 3 3 1 — — = — 26 
S 5 1 3 l 1 0 sib woe gy A apa oa 7 | 
a| 6 | 1 1 2 l = = = aE et = 6 
} ? oto l 0 na = aly el = = =e, 1 
oO 8 0 | 0 as = aos sla ws a, fe a 1 
9 O 0 —_— a a sei eared iy Sani = ca 8) 
10 0 — — —— —_— ee seam <n — — — 0 
Totals] 768 | 360 | 248 | 171 | 132 89 78 71 17 23 13 | 2000 
TABLE LVI.
Right Thumb and Index.
Right Thumb.
Right Index         |   A |  SL |  LL |   W |   C | Totals
A                   |  29 |  97 | 148 |  50 |  28 |    352
SL                  |  12 | 125 | 320 | 139 |  58 |    654
LL                  |   2 |  27 | 149 | 125 |  36 |    339
W                   |   1 |  26 | 144 | 260 |  50 |    481
C                   |   2 |  15 |  54 |  75 |  28 |    174
Totals              |  46 | 290 | 815 | 649 | 200 |   2000




























TABLE LVII.
Right Thumb and Middle Finger.
Right Thumb.
Right Middle Finger |   A |  SL |  LL |   W |   C | Totals
A                   |  25 |  66 |  75 |  31 |  15 |    212
SL                  |  18 | 174 | 428 | 215 |  86 |    921
LL                  |   1 |  26 | 223 | 208 |  58 |    516
W                   |   2 |  15 |  69 | 160 |  28 |    274
C                   |   0 |   9 |  20 |  35 |  13 |     77
Totals              |  46 | 290 | 815 | 649 | 200 |   2000
TABLE LVIII.
Right Thumb and Ring Finger.
Right Thumb.
Right Ring Finger   |   A |  SL |  LL |   W |   C | Totals
A                   |  14 |  24 |  17 |   7 |   1 |     63
SL                  |  17 | 129 | 248 |  64 |  31 |    489
LL                  |   5 |  60 | 248 | 166 |  64 |    543
W                   |   8 |  55 | 229 | 355 |  82 |    729
C                   |   2 |  22 |  73 |  57 |  22 |    176
Totals              |  46 | 290 | 815 | 649 | 200 |   2000
TABLE LIX.
Right Thumb and Little Finger.
Right Thumb.
Right Little Finger |   A |  SL |  LL |   W |   C | Totals
A                   |   8 |  14 |   5 |   3 |   1 |     31
SL                  |  30 | 198 | 414 | 162 |  66 |    870
LL                  |   5 |  61 | 300 | 304 |  94 |    764
W                   |   1 |  10 |  58 | 137 |  22 |    228
C                   |   2 |   7 |  38 |  43 |  17 |    107
Totals              |  46 | 290 | 815 | 649 | 200 |   2000
TABLE LX. 
Right Index and Middle Finger. 
Right Index. 
W C Totals 


| A SL LL | 








Right Middle Finger. 



















TABLE LXI. 
Right Index and Ring Finger.
Right Index. 







































































8 A | SL-| LL | Ww} © | Totals | 
EN 
|: 45 a ae 0 0 63 | 
i 46-48 | oe | on 21 | 16 489 
2 LL 84 | 189 | 127 92 | 51 543 
oa W 36 130 149 | 325 | 89 729 
inne Cc 26 46 43 | 43 18 176 
“oo | ‘ ee ; 
g | Totals} a2 | 654 | 339 | 481 | 174 | 2000 | 
} 
TABLE LXII.
Right Index and Little Finger.
Right Index.
Right Little Finger |   A |  SL |  LL |   W |   C | Totals
A                   |  23 |   8 |   0 |   0 |   0 |     31
SL                  | 222 | 402 |  97 | 106 |  43 |    870
LL                  |  95 | 199 | 181 | 199 |  90 |    764
W                   |   4 |  26 |  38 | 136 |  24 |    228
C                   |   8 |  19 |  23 |  40 |  17 |    107
Totals              | 352 | 654 | 339 | 481 | 174 |   2000
TABLE LXIII.
Right Middle and Ring Fingers.
Right Middle Finger.
Right Ring Finger   |   A |  SL |  LL |   W |   C | Totals
A                   |  49 |  14 |   0 |   0 |   0 |     63
SL                  | 104 | 336 |  43 |   2 |   4 |    489
LL                  |  32 | 298 | 181 |  16 |  16 |    543
W                   |  16 | 189 | 240 | 240 |  44 |    729
C                   |  11 |  84 |  52 |  16 |  13 |    176
Totals              | 212 | 921 | 516 | 274 |  77 |   2000
TABLE LXIV. 
Right Middle and Little Fingers. 
Right Middle Finger. 
S| A | SL | LL W C | Totals | 
fa - | 
et a 2 | 1 0 0 0 31 | 
o| SL 147 | 535 129 39 20 870 
3 | LL 37 298 | 280 114 35 764 | 
=| 0 47 | 7A 92 15 228 | 
ol SUE . pi: 29 7 107 
| 2 ee mee 
“g | Totals | 212 | 921 516 | 274 77 2000 








































Left Index. 


Left Middle Finger. 




TABLE LXV. 
Right Ring and Little Fingers. 


Right Ring Finger. 





Right Little Finger. 


Left Ring Finger. 










































































| 767 | 341 





A | SL | LL | W | Cc | Totals 
bg 16 Mo bes2y fs 31 | 
| SL 41 400 | 212 155 | 62 870 | 
=: 4is@i| ss 297 83 764 
= 0 6 11 199 | 12 228 
| © hg ees 6 7 | 19 107 | 
Totals | 63 489 543 729 | 176 2000 

TABLE LXVI.
Left Thumb and Index.
Left Thumb.
Left Index          |   A |  SL |  LL |   W |   C | Totals
A                   |  47 | 133 |  78 |  25 |  30 |    313
SL                  |  32 | 313 | 355 |  68 |  65 |    833
LL                  |   5 |  40 | 143 |  47 |  47 |    282
W                   |   3 |  46 | 136 | 166 |  86 |    437
C                   |   4 |  15 |  55 |  35 |  26 |    135
Totals              |  91 | 547 | 767 | 341 | 254 |   2000
TABLE LXVII.
Left Thumb and Middle Finger.
Left Thumb.
Left Middle Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  31 |  96 |  51 |  17 |  20 |    215
SL                  |  46 | 323 | 354 |  85 |  79 |    887
LL                  |   7 |  86 | 267 | 113 |  83 |    556
W                   |   6 |  31 |  62 |  97 |  44 |    240
C                   |   1 |  11 |  33 |  29 |  28 |    102
Totals              |  91 | 547 | 767 | 341 | 254 |   2000
TABLE LXVIII.
Left Thumb and Ring Finger.
Left Thumb.
Left Ring Finger    |   A |  SL |  LL |   W |   C | Totals
A                   |  13 |  38 |   7 |   5 |   3 |     66
SL                  |  45 | 264 | 198 |  38 |  38 |    583
LL                  |  21 | 152 | 342 | 110 |  87 |    712
W                   |   5 |  60 | 172 | 158 |  96 |    491
C                   |   7 |  33 |  48 |  30 |  30 |    148
Totals              |  91 | 547 | 767 | 341 | 254 |   2000





























TABLE LXIX.
Left Thumb and Little Finger.
Left Thumb.
Left Little Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  13 |  18 |   2 |   1 |   1 |     35
SL                  |  62 | 380 | 377 |  81 |  59 |    959
LL                  |  12 | 123 | 314 | 180 | 139 |    768
W                   |   3 |  13 |  44 |  53 |  37 |    150
C                   |   1 |  13 |  30 |  26 |  18 |     88
Totals              |  91 | 547 | 767 | 341 | 254 |   2000
TABLE LXX.
Left Index and Middle Finger.
Left Index.
Left Middle Finger  |   A |  SL |  LL |   W |   C | Totals
A                   | 117 |  84 |   9 |   2 |   3 |    215
SL                  | 166 | 525 |  80 |  79 |  37 |    887
LL                  |  19 | 187 | 157 | 139 |  54 |    556
W                   |   6 |  25 |  21 | 171 |  17 |    240
C                   |   5 |  12 |  15 |  46 |  24 |    102
Totals              | 313 | 833 | 282 | 437 | 135 |   2000
TABLE LXXI.
Left Index and Ring Finger.
Left Index.
Left Ring Finger    |   A |  SL |  LL |   W |   C | Totals
A                   |  50 |  14 |   1 |   0 |   1 |     66
SL                  | 154 | 355 |  31 |  22 |  21 |    583
LL                  |  72 | 326 | 141 | 130 |  43 |    712
W                   |  19 |  93 |  85 | 253 |  41 |    491
C                   |  18 |  45 |  24 |  32 |  29 |    148
Totals              | 313 | 833 | 282 | 437 | 135 |   2000
TABLE LXXII.
Left Index and Little Finger.
Left Index.
Left Little Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  25 |   8 |   1 |   1 |   0 |     35
SL                  | 215 | 526 |  87 |  88 |  43 |    959
LL                  |  64 | 253 | 164 | 215 |  72 |    768
W                   |   6 |  26 |  19 |  85 |  14 |    150
C                   |   3 |  20 |  11 |  48 |   6 |     88
Totals              | 313 | 833 | 282 | 437 | 135 |   2000




























TABLE LXXIII.
Left Middle and Ring Fingers.
Left Middle Finger.
Left Ring Finger    |   A |  SL |  LL |   W |   C | Totals
A                   |  52 |  13 |   1 |   0 |   0 |     66
SL                  | 119 | 421 |  35 |   2 |   6 |    583
LL                  |  35 | 311 | 308 |  38 |  20 |    712
W                   |   7 |  88 | 160 | 176 |  60 |    491
C                   |   2 |  54 |  52 |  24 |  16 |    148
Totals              | 215 | 887 | 556 | 240 | 102 |   2000
TABLE LXXIV.
Left Middle and Little Fingers.
Left Middle Finger.
Left Little Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  23 |  11 |   0 |   1 |   0 |     35
SL                  | 149 | 578 | 174 |  43 |  15 |    959
LL                  |  39 | 256 | 306 | 110 |  57 |    768
W                   |   2 |  25 |  46 |  58 |  19 |    150
C                   |   2 |  17 |  30 |  28 |  11 |     88
Totals              | 215 | 887 | 556 | 240 | 102 |   2000
TABLE LXXV.
Left Ring and Little Fingers.
Left Ring Finger.
Left Little Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  17 |  13 |   4 |   0 |   1 |     35
SL                  |  48 | 474 | 294 |  99 |  44 |    959
LL                  |   1 |  92 | 390 | 212 |  73 |    768
W                   |   0 |   3 |  12 | 120 |  15 |    150
C                   |   0 |   1 |  12 |  60 |  15 |     88
Totals              |  66 | 583 | 712 | 491 | 148 |   2000
TABLE LXXVI.
Right Thumb and Left Thumb.
Right Thumb.
Left Thumb          |   A |  SL |  LL |   W |   C | Totals
A                   |  31 |  44 |  10 |   2 |   4 |     91
SL                  |  13 | 204 | 246 |  53 |  31 |    547
LL                  |   0 |  30 | 468 | 180 |  89 |    767
W                   |   1 |   3 |  44 | 270 |  23 |    341
C                   |   1 |   9 |  47 | 144 |  53 |    254
Totals              |  46 | 290 | 815 | 649 | 200 |   2000

















TABLE LXXVII.
Right Thumb and Left Index.
Right Thumb.
Left Index          |   A |  SL |  LL |   W |   C | Totals
A                   |  27 |  94 | 123 |  41 |  28 |    313
SL                  |  16 | 150 | 420 | 181 |  66 |    833
LL                  |   1 |  17 | 115 | 105 |  44 |    282
W                   |   1 |  27 | 105 | 264 |  40 |    437
C                   |   1 |   2 |  52 |  58 |  22 |    135
Totals              |  46 | 290 | 815 | 649 | 200 |   2000








TABLE LXXVIII.
Right Thumb and Left Middle Finger.
Right Thumb.
Left Middle Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  23 |  60 |  86 |  32 |  14 |    215
SL                  |  17 | 167 | 414 | 210 |  79 |    887
LL                  |   0 |  40 | 232 | 214 |  70 |    556
W                   |   5 |  18 |  57 | 141 |  19 |    240
C                   |   1 |   5 |  26 |  52 |  18 |    102
Totals              |  46 | 290 | 815 | 649 | 200 |   2000
TABLE LXXIX.
Right Thumb and Left Ring Finger.
Right Thumb.
Left Ring Finger    |   A |  SL |  LL |   W |   C | Totals
A                   |   8 |  32 |  19 |   6 |   1 |     66
SL                  |  23 | 145 | 276 |  98 |  41 |    583
LL                  |   7 |  69 | 319 | 229 |  88 |    712
W                   |   3 |  29 | 142 | 261 |  56 |    491
C                   |   5 |  15 |  59 |  55 |  14 |    148
Totals              |  46 | 290 | 815 | 649 | 200 |   2000











TABLE LXXX. 


Right Thumb and Left Little Finger. 
Right Thumb. 














A | SL LL TY | @¢ Totals 

A “1 ais 2/1 0 35 

SL 29 | 207 468 181 74 959 

LL 2 54 275 341 96 768 

W 3 8 | 40 80 19 150 

| © Se ae 2% | 45 | 11 88 

| Totals | 46 | 290 | 815 | 649 | 200 | 2000 
! 1 






















































TABLE LXXXI.
Right Index and Left Thumb.
Right Index.
Left Thumb          |   A |  SL |  LL |   W |   C | Totals
A                   |  47 |  29 |  10 |   2 |   3 |     91
SL                  | 154 | 241 |  61 |  65 |  26 |    547
LL                  | 101 | 266 | 158 | 166 |  76 |    767
W                   |  21 |  58 |  69 | 151 |  42 |    341
C                   |  29 |  60 |  41 |  97 |  27 |    254
Totals              | 352 | 654 | 339 | 481 | 174 |   2000
TABLE LXXXII. 
Right Index and Left Index. 
Right Index. 
A SL, SL, LL, | Se ae. C Totals 
ae 185 42 58 4 WD lee te 313 
s SL, 65 103 | 7% 23 9 11 9 295 
 c SL,, 83 88 187 29 46 59 46 538 
i LL, 4 11 4 29 2 25 13 88 
2 LL,, 7 14 23 16 64 | 47 23 194 
3 W 2 11 13 49 24 287 51 437 
C 6 13 12 24 10 43 | 2 135 
Totals | 352 | 282 372 | 174 | 165 481 | 174 2000 
TABLE LXXXIII.
Right Index and Left Middle Finger.
Right Index.
Left Middle Finger  |   A |  SL |  LL |   W |   C | Totals
A                   | 108 |  87 |   9 |   7 |   4 |    215
SL                  | 198 | 425 | 109 | 103 |  52 |    887
LL                  |  36 | 106 | 168 | 170 |  76 |    556
W                   |   4 |  21 |  34 | 154 |  27 |    240
C                   |   6 |  15 |  19 |  47 |  15 |    102
Totals              | 352 | 654 | 339 | 481 | 174 |   2000
TABLE LXXXIV. 
Right Index and Left Ring Finger. 
Right Index. 
he A SL | LL | W C Totals 
= 
op 
=| 
pe 
2 | 
® | 
_ 





























TABLE LXXXV.
Right Index and Left Little Finger.
Right Index.
Left Little Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  24 |  10 |   0 |   0 |   1 |     35
SL                  | 236 | 435 | 116 | 122 |  50 |    959
LL                  |  81 | 183 | 176 | 224 | 104 |    768
W                   |   5 |  21 |  30 |  86 |   8 |    150
C                   |   6 |   5 |  17 |  49 |  11 |     88
Totals              | 352 | 654 | 339 | 481 | 174 |   2000
TABLE LXXXVI.
Right Middle Finger and Left Thumb.
Right Middle Finger.
Left Thumb          |   A |  SL |  LL |   W |   C | Totals
A                   |  36 |  43 |   6 |   4 |   2 |     91
SL                  |  97 | 311 |  93 |  32 |  14 |    547
LL                  |  47 | 374 | 235 |  90 |  21 |    767
W                   |  11 | 105 | 111 |  95 |  19 |    341
C                   |  21 |  88 |  71 |  53 |  21 |    254
Totals              | 212 | 921 | 516 | 274 |  77 |   2000
TABLE LXXXVII.
Right Middle Finger and Left Index.
Right Middle Finger.
Left Index          |   A |  SL |  LL |   W |   C | Totals
A                   | 110 | 176 |  21 |   1 |   5 |    313
SL                  |  86 | 529 | 177 |  24 |  17 |    833
LL                  |   7 |  95 | 125 |  42 |  13 |    282
W                   |   5 |  80 | 144 | 180 |  28 |    437
C                   |   4 |  41 |  49 |  27 |  14 |    135
Totals              | 212 | 921 | 516 | 274 |  77 |   2000
TABLE LXXXVIII.
Right Middle and Left Middle Fingers.
Right Middle Finger.
Left Middle Finger  |   A |  SL |  LL |   W |   C | Totals
A                   | 115 |  94 |   4 |   0 |   2 |    215
SL                  |  84 | 635 | 129 |  25 |  14 |    887
LL                  |   8 | 152 | 295 |  70 |  31 |    556
W                   |   3 |  20 |  65 | 140 |  12 |    240
C                   |   2 |  20 |  23 |  39 |  18 |    102
Totals              | 212 | 921 | 516 | 274 |  77 |   2000





















TABLE LXXXIX.
Right Middle and Left Ring Fingers.
Right Middle Finger.
Left Ring Finger    |   A |  SL |  LL |   W |   C | Totals
A                   |  49 |  16 |   1 |   0 |   0 |     66
SL                  | 110 | 398 |  59 |   8 |   8 |    583
LL                  |  39 | 346 | 241 |  63 |  23 |    712
W                   |   9 | 102 | 165 | 184 |  31 |    491
C                   |   5 |  59 |  50 |  19 |  15 |    148
Totals              | 212 | 921 | 516 | 274 |  77 |   2000
TABLE XC.
Right Middle and Left Little Fingers.
Right Middle Finger.
Left Little Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  22 |  13 |   0 |   0 |   0 |     35
SL                  | 148 | 571 | 168 |  53 |  19 |    959
LL                  |  35 | 289 | 260 | 137 |  47 |    768
W                   |   4 |  30 |  57 |  56 |   3 |    150
C                   |   3 |  18 |  31 |  28 |   8 |     88
Totals              | 212 | 921 | 516 | 274 |  77 |   2000
TABLE XCI.
Right Ring Finger and Left Thumb.
Right Ring Finger.
Left Thumb          |   A |  SL |  LL |   W |   C | Totals
A                   |  15 |  38 |  13 |  17 |   8 |     91
SL                  |  34 | 231 | 112 | 121 |  49 |    547
LL                  |   7 | 167 | 258 | 262 |  73 |    767
W                   |   3 |  28 |  88 | 199 |  23 |    341
C                   |   4 |  25 |  72 | 130 |  23 |    254
Totals              |  63 | 489 | 543 | 729 | 176 |   2000
TABLE XCII.
Right Ring Finger and Left Index.
Right Ring Finger.
Left Index          |   A |  SL |  LL |   W |   C | Totals
A                   |  41 | 141 |  71 |  39 |  21 |    313
SL                  |  20 | 309 | 244 | 183 |  77 |    833
LL                  |   1 |  13 | 112 | 129 |  27 |    282
W                   |   0 |  15 |  79 | 307 |  36 |    437
C                   |   1 |  11 |  37 |  71 |  15 |    135
Totals              |  63 | 489 | 543 | 729 | 176 |   2000




















TABLE XCIII.
Right Ring and Left Middle Fingers.
Right Ring Finger.
Left Middle Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  47 | 122 |  28 |  13 |   5 |    215
SL                  |  15 | 336 | 274 | 179 |  83 |    887
LL                  |   1 |  24 | 204 | 266 |  61 |    556
W                   |   0 |   5 |  26 | 194 |  15 |    240
C                   |   0 |   2 |  11 |  77 |  12 |    102
Totals              |  63 | 489 | 543 | 729 | 176 |   2000
TABLE XCIV.
Right Ring and Left Ring Fingers.
Right Ring Finger.
Left Ring Finger    |   A |  SL |  LL |   W |   C | Totals
A                   |  37 |  26 |   2 |   0 |   1 |     66
SL                  |  20 | 363 | 114 |  45 |  41 |    583
LL                  |   4 |  81 | 367 | 183 |  77 |    712
W                   |   1 |   9 |  30 | 423 |  28 |    491
C                   |   1 |  10 |  30 |  78 |  29 |    148
Totals              |  63 | 489 | 543 | 729 | 176 |   2000








TABLE XCV.
Right Ring and Left Little Fingers.
Right Ring Finger.
Left Little Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  17 |  15 |   3 |   0 |   0 |     35
SL                  |  42 | 399 | 250 | 194 |  74 |    959
LL                  |   4 |  68 | 276 | 339 |  81 |    768
W                   |   0 |   5 |   9 | 125 |  11 |    150
C                   |   0 |   2 |   5 |  71 |  10 |     88
Totals              |  63 | 489 | 543 | 729 | 176 |   2000
TABLE XCVI.
Right Little Finger and Left Thumb.
Right Little Finger.
Left Thumb          |   A |  SL |  LL |   W |   C | Totals
A                   |   8 |  60 |  13 |   5 |   5 |     91
SL                  |  16 | 345 | 144 |  24 |  18 |    547
LL                  |   2 | 320 | 330 |  81 |  34 |    767
W                   |   3 |  77 | 165 |  71 |  25 |    341
C                   |   2 |  68 | 112 |  47 |  25 |    254
Totals              |  31 | 870 | 764 | 228 | 107 |   2000




































TABLE XCVII.
Right Little Finger and Left Index.
Right Little Finger.
Left Index          |   A |  SL |  LL |   W |   C | Totals
A                   |  22 | 204 |  76 |   6 |   5 |    313
SL                  |   8 | 477 | 269 |  44 |  35 |    833
LL                  |   0 |  76 | 155 |  34 |  17 |    282
W                   |   1 |  73 | 198 | 128 |  37 |    437
C                   |   0 |  40 |  66 |  16 |  13 |    135
Totals              |  31 | 870 | 764 | 228 | 107 |   2000














TABLE XCVIII.
Right Little and Left Middle Fingers.
Right Little Finger.
Left Middle Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  17 | 158 |  35 |   2 |   3 |    215
SL                  |  12 | 507 | 291 |  42 |  35 |    887
LL                  |   0 | 152 | 301 |  71 |  32 |    556
W                   |   2 |  39 |  89 |  84 |  26 |    240
C                   |   0 |  14 |  48 |  29 |  11 |    102
Totals              |  31 | 870 | 764 | 228 | 107 |   2000

TABLE XCIX.
Right Little and Left Ring Fingers.
Right Little Finger.
Left Ring Finger    |   A |  SL |  LL |   W |   C | Totals
A                   |  12 |  50 |   3 |   0 |   1 |     66
SL                  |  15 | 440 | 106 |   9 |  13 |    583
LL                  |   2 | 255 | 398 |  30 |  27 |    712
W                   |   1 |  89 | 178 | 173 |  50 |    491
C                   |   1 |  36 |  79 |  16 |  16 |    148
Totals              |  31 | 870 | 764 | 228 | 107 |   2000
TABLE C.
Right Little and Left Little Fingers.
Right Little Finger.
Left Little Finger  |   A |  SL |  LL |   W |   C | Totals
A                   |  18 |  17 |   0 |   0 |   0 |     35
SL                  |  11 | 721 | 176 |  21 |  30 |    959
LL                  |   1 | 115 | 543 |  74 |  35 |    768
W                   |   1 |  13 |  22 |  99 |  15 |    150
C                   |   0 |   4 |  23 |  34 |  27 |     88
Totals              |  31 | 870 | 764 | 228 | 107 |   2000


























ON THE PROBLEM OF SEXING OSTEOMETRIC 
MATERIAL 


By KARL PEARSON, F.R.S. 


It is well known that anthropometric, particularly craniometric measurements 
give frequency series, which for moderate sized populations follow closely the normal 
or Laplace-Gaussian distribution. Measurements of stature, cubit, head-length,
cephalic index, etc., etc., obey with sufficient accuracy for most purposes of science
the normal law. This statement may with a high degree of certainty be extended
to practically almost all measurements on the adult skeleton. But a new difficulty 
arises in dealing with the parts of the skeleton: the sexing of the several bones of 
the human body is by no means certain, and this is especially the case when we 
come to deal—not with the cranium or the pelvis but with the long bones. In 
order to get over this difficulty, and to find the constants for each sex, it occurred 
to me some years back when the sexing of the long bones had presented this 
problem very forcibly to workers in my laboratory, that the method of my first 
contribution to the mathematical theory of evolution* might be applied. Namely, 
we might take the unsexed material and assume it to consist of a compound of 
male and female data, the frequency curve for each of these being normal ; the 
two components might then be found in the manner of the paper just referred to. 
The method was especially likely to be successful, when the series was otherwise 
homogeneous, the numbers large and the character dealt with substantially diffe- 
rentiated sexually. Of course the method does not give the sex of each individual 
bone, but I have shown in another memoir†, how four to six characters thus
resolved form a basis for determining the probable sex of each bone, and this with 
an accuracy which is very probably as great as, or even greater than, anatomical 
appreciation unbased on a system of numerical measurement. 


One of the few objections to the method is the labour involved in the process.
While the analysis required in the application of the method is not so severe that
it has not been applied in a large number of cases by workers in the Biometric 


* Phil. Trans. Vol. 185, A, pp. 71—110. 
† To appear in the next number of this Journal.

























Laboratory, it is still considerably beyond the powers of most of the present 
workers in anthropometry, and probably no anatomist of the present day has the 
mathematical knowledge requisite for the solution of the reducing nonic, or the 
arithmetical patience required for the calculation of its coefficients. It has occurred 
to me, however, that the work might be considerably shortened by the following 
considerations. The bones usually dealt with are those found in ancient cemeteries, 
in plague pits, clearance pits or crypts. It is probable, though by no means 
certain, that adult female bones in such cases would be rather more numerous 
than male. On the other hand being somewhat smaller they are asserted by some 
writers as likely to be more frequently broken, and they certainly may more readily 
escape preservation or measurement. If we take these two causes as counter- 
acting each other, we may assume as a first approximation that the numbers of 
male and female bones will be equal. In the next place it is a result of much 
anthropometric experience that male and female variations, i.e. their standard de- 
viations, are closely alike. These again we can take equal to a first approximation. 
Accordingly, to this first approximation, our osteometric series may be considered 
to consist of two equal normal components with different means. Let the mean
of the unsexed material be $M$, and let the actual means of the sexed components be
$m_1$, $m_2$, their standard deviations be $\sigma_1$, $\sigma_2$, and their total frequencies $n_1$ and $n_2$,
where the subscript 1 refers, say, to the males, and 2 to the females. Then $m_1$,
$m_2$, $\sigma_1$, $\sigma_2$, $n_1$ and $n_2$ are the quantities we desire to discover. Let the
moment-coefficients of the total material be, in the usual notation, $\mu_2$, $\mu_3$, $\mu_4$, $\mu_5$ and let
$N$ $(= n_1 + n_2)$ be the total unsexed population. We shall write as customary
$\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$, $\beta_3 = \mu_3\mu_5/\mu_2^4$. Then, if our hypothesis be correct and the
material consist very nearly of two equal normal distributions, $\beta_1$ and $\beta_3$ ought
to be very small, while $\beta_2$ will be large in relation to them.


It is convenient also to write: 


$$\epsilon_1 = \tfrac{1}{2}(3 - \beta_2), \qquad \epsilon_2 = 10\beta_1 - \beta_3 \tag{i}$$
$$m_1 = M + \gamma_1, \qquad m_2 = M + \gamma_2 \tag{ii}$$
$$q_2 = \gamma_1\gamma_2/\mu_2, \qquad q_3 = (\gamma_1 + \gamma_2)/\sqrt{\mu_2} \tag{iii}$$

WER ok caitiie oon cod tcagitacapansuesuatowureumeds (iv). 


Then the fundamental nonic may be written : 


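$$q_2^9 - 7\epsilon_1 q_2^7 + \tfrac{3}{2}\beta_1 q_2^6 - 3(\epsilon_2 - 5\epsilon_1^2)\,q_2^5 - \left(37\beta_1\epsilon_1 + \tfrac{3}{4}\,\epsilon_2^2/\beta_1\right) q_2^4$$
$$\quad + 3(4\beta_1^2 - 3\epsilon_1\epsilon_2 - 3\epsilon_1^3)\,q_2^3 + 3(\beta_1\epsilon_2 - \tfrac{7}{2}\beta_1\epsilon_1^2)\,q_2^2 + 8\beta_1^2\epsilon_1\,q_2 - \beta_1^3 = 0 \tag{v}$$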


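Further:
$$q_3 = \frac{\sqrt{\beta_1}\,\{\beta_1 - 6\epsilon_1 q_2 - \tfrac{3}{2}(\epsilon_2/\beta_1)\,q_2^2 - 4q_2^3\}}{q_2\,(2\beta_1 - 3\epsilon_1 q_2 + q_2^3)} \tag{vi}$$
where the sign of $\sqrt{\beta_1}$ is determined by that of $\mu_3$.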
Again:
$$n_1 = -\frac{\gamma_2}{\gamma_1 - \gamma_2}\,N, \qquad n_2 = \frac{\gamma_1}{\gamma_1 - \gamma_2}\,N \tag{vii}$$















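Lastly:
$$\sigma_1^2 = \mu_2(1 + q_2) - \tfrac{1}{3}\,\mu_3/\gamma_2 - \tfrac{1}{3}\sqrt{\mu_2}\,q_3\gamma_1,$$
$$\sigma_2^2 = \mu_2(1 + q_2) - \tfrac{1}{3}\,\mu_3/\gamma_1 - \tfrac{1}{3}\sqrt{\mu_2}\,q_3\gamma_2. \tag{viii}$$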

Equations (ii), (iii), (iv), (v), (vi), (vii) and (viii) form the complete solution of 
the problem when we make no approximations whatever*. 


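If, however, $\beta_1 = \beta_3 = 0$, then, the two components being equal, we have†:

$$n_1 = n_2 = \tfrac{1}{2}N, \qquad \gamma_1 = -\gamma_2 = \sqrt{\mu_2\,\epsilon_1^{1/2}}, \qquad \sigma_1 = \sigma_2 = \sqrt{\mu_2\,(1 - \epsilon_1^{1/2})} \tag{ix}$$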
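
The passage from two equal components to the first approximation can be sketched in a few lines. This is a worked derivation added for clarity, not taken from the memoir; it assumes $\epsilon_1 = \tfrac{1}{2}(3 - \beta_2)$ and uses the second and fourth moments of a mixture of two equal normal curves with means $\pm\gamma$ about $M$ and common standard deviation $\sigma$:

$$\begin{aligned}
\mu_2 &= \gamma^2 + \sigma^2, \\
\mu_4 &= \gamma^4 + 6\gamma^2\sigma^2 + 3\sigma^4 = 3\mu_2^2 - 2\gamma^4, \\
\gamma^4 &= \tfrac{1}{2}(3\mu_2^2 - \mu_4) = \mu_2^2 \cdot \tfrac{1}{2}(3 - \beta_2) = \mu_2^2\,\epsilon_1, \\
\gamma^2 &= \mu_2\,\epsilon_1^{1/2}, \qquad \sigma^2 = \mu_2 - \gamma^2 = \mu_2\,(1 - \epsilon_1^{1/2}).
\end{aligned}$$

The odd moments vanish identically in this symmetric case, which is why only $\mu_2$ and $\mu_4$ enter.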


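It will be seen that it is needful in order that the solution may be real that $\epsilon_1$
should be positive or $\beta_2 < 3$, i.e. the total frequency should be platykurtic. Now
let us suppose that the values given by (ix) are a first approximation and that we
need a second approximation in which the two normal curves will be unequal in
frequency, mean and standard deviation. Write:

$$n = \tfrac{1}{2}N, \qquad \gamma = \sqrt{\mu_2\,\epsilon_1^{1/2}}, \qquad \sigma = \sqrt{\mu_2\,(1 - \epsilon_1^{1/2})} \tag{ix$^{\text{bis}}$}$$
and suppose:
$$n_1 = n + \delta n_1, \qquad n_2 = n + \delta n_2,$$
$$\gamma_1 = \gamma + \delta\gamma_1, \qquad \gamma_2 = -\gamma + \delta\gamma_2,$$
$$\sigma_1 = \sigma + \delta\sigma_1, \qquad \sigma_2 = \sigma + \delta\sigma_2,$$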

where the differentials represent small quantities of which the squares and 
products may be neglected to a second approximation. 
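Our equations are:
$$n_1 + n_2 = N,$$
$$n_1\gamma_1 + n_2\gamma_2 = 0,$$
$$n_1(\gamma_1^2 + \sigma_1^2) + n_2(\gamma_2^2 + \sigma_2^2) = N\mu_2,$$
$$n_1(\gamma_1^3 + 3\gamma_1\sigma_1^2) + n_2(\gamma_2^3 + 3\gamma_2\sigma_2^2) = N\mu_3,$$
$$n_1(\gamma_1^4 + 6\gamma_1^2\sigma_1^2 + 3\sigma_1^4) + n_2(\gamma_2^4 + 6\gamma_2^2\sigma_2^2 + 3\sigma_2^4) = N\mu_4,$$
$$n_1(\gamma_1^5 + 10\gamma_1^3\sigma_1^2 + 15\gamma_1\sigma_1^4) + n_2(\gamma_2^5 + 10\gamma_2^3\sigma_2^2 + 15\gamma_2\sigma_2^4) = N\mu_5.$$
We now differentiate these and after differentiation put
$$n_1 = n_2 = n, \qquad \gamma_1 = -\gamma_2 = \gamma, \qquad \sigma_1 = \sigma_2 = \sigma.$$
Hence we find:
$$\delta n_1 + \delta n_2 = 0 \tag{x}$$
$$n(\delta\gamma_1 + \delta\gamma_2) + 2\gamma\,\delta n_1 = 0 \tag{xi}$$
$$2n\gamma(\delta\gamma_1 - \delta\gamma_2) + 2n\sigma(\delta\sigma_1 + \delta\sigma_2) = 0 \tag{xii}$$
$$3n(\delta\gamma_1 + \delta\gamma_2)(\gamma^2 + \sigma^2) + 2\,\delta n_1\,\gamma(\gamma^2 + 3\sigma^2) + 6n\sigma\gamma(\delta\sigma_1 - \delta\sigma_2) = N\mu_3 \tag{xiii}$$
$$\gamma(\gamma^2 + 3\sigma^2)(\delta\gamma_1 - \delta\gamma_2) + 3\sigma(\gamma^2 + \sigma^2)(\delta\sigma_1 + \delta\sigma_2) = 0 \tag{xiv}$$
$$n(\delta\gamma_1 + \delta\gamma_2)(5\gamma^4 + 30\gamma^2\sigma^2 + 15\sigma^4) + 2\,\delta n_1\,\gamma(\gamma^4 + 10\gamma^2\sigma^2 + 15\sigma^4) + n(\delta\sigma_1 - \delta\sigma_2)\,20\gamma\sigma(\gamma^2 + 3\sigma^2) = N\mu_5 \tag{xv}$$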


* They are, in a somewhat better form, those originally given by me in Phil. Trans. Vol. 185, A, 
1894, pp. 71—110 ; see Equations (14), (15), (18), (19), (27) and (29) of that memoir. 

† Loc. cit. footnote, p. 91.
‡ Loc. cit. p. 82.
















where it must be remembered that the differential terms are introduced solely to
account for the asymmetry as represented by $\mu_3$ and $\mu_5$, assumed to be zero to
a first approximation.


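But (xii) and (xiv) show us that we must have:

$$\delta\gamma_1 = \delta\gamma_2, \qquad \delta\sigma_1 = -\delta\sigma_2.$$

Hence from (xi):
$$n\,\delta\gamma_1 + \gamma\,\delta n_1 = 0 \tag{xvi}$$
(xiii) now becomes:
$$2\gamma^2\,\delta\gamma_1 + 6\sigma\gamma\,\delta\sigma_1 = \mu_3,$$
and (xv):
$$4\gamma^2\,\delta\gamma_1(\gamma^2 + 5\sigma^2) + 20\gamma\sigma(\gamma^2 + 3\sigma^2)\,\delta\sigma_1 = \mu_5.$$
Whence solving we find:
$$\frac{\delta\gamma_1}{\gamma} = \frac{\delta\gamma_2}{\gamma} = \frac{5}{4}\left(1 + 3\frac{\sigma^2}{\gamma^2}\right)\frac{\mu_3}{\gamma^3} - \frac{3}{8}\frac{\mu_5}{\gamma^5} \tag{xvii}$$
$$\frac{\delta\sigma_1}{\sigma} = -\frac{\delta\sigma_2}{\sigma} = -\frac{1}{4}\frac{\gamma^2}{\sigma^2}\left\{\left(1 + 5\frac{\sigma^2}{\gamma^2}\right)\frac{\mu_3}{\gamma^3} - \frac{1}{2}\frac{\mu_5}{\gamma^5}\right\} \tag{xviii}$$
$$\frac{\delta n_1}{n} = -\frac{\delta n_2}{n} = -\frac{5}{4}\left(1 + 3\frac{\sigma^2}{\gamma^2}\right)\frac{\mu_3}{\gamma^3} + \frac{3}{8}\frac{\mu_5}{\gamma^5} \tag{xix}$$

These form together with (ix)$^{\text{bis}}$ the complete solution of the problem.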


The following example illustrates the procedure: 541 measurements were made 
of the bicondylar width of English femora, right and left, male and female being 
mixed. The frequency below resulted. 


Frequency Distribution of 541 Femora for Bicondylar Width.

mm. | Frequency | mm. | Frequency | mm. | Frequency
61  |    1      | 71  |   23      | 81  |   28
62  |    1      | 72  |   33.5    | 82  |   23
63  |    1.5    | 73  |   25      | 83  |   19
64  |    5      | 74  |   22      | 84  |   17.5
65  |   13.5    | 75  |   36      | 85  |   19.5
66  |   14      | 76  |   25.5    | 86  |   16.5
67  |   15.5    | 77  |   29.5    | 87  |    7.5
68  |   22      | 78  |   32.5    | 88  |    3
69  |   31      | 79  |   19.5    | 89  |    3.5
70  |   19      | 80  |   33      | 90  |    0.5





The constants of this distribution were:
$M$ = 75.8152,
$\mu_2$ = 37.692,112, $\mu_3$ = -2.587,693,
$\mu_4$ = 3020.893,695, $\mu_5$ = -83.260,992.
Hence we deduce:
$\beta_1$ = .000,125,047, $\beta_3$ = .000,106,750,
$\beta_2$ = 2.126,349, $\epsilon_1$ = .436,8255,
$\epsilon_2$ = .001,143,72.
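
These constants can be checked from the printed table. The sketch below is an illustration only, not the author's working. It assumes unit class width and Sheppard's correction of 1/12 for the second moment (the text does not state this, but it reproduces the printed $\mu_2$), and it takes $\epsilon_1 = \tfrac{1}{2}(3 - \beta_2)$ and $\epsilon_2 = 10\beta_1 - \beta_3$, which agree with the printed values. The higher moments are taken as printed rather than recomputed.

```python
# Frequency table of bicondylar width (mm.) as printed above.
freq = {
    61: 1, 62: 1, 63: 1.5, 64: 5, 65: 13.5, 66: 14, 67: 15.5, 68: 22, 69: 31, 70: 19,
    71: 23, 72: 33.5, 73: 25, 74: 22, 75: 36, 76: 25.5, 77: 29.5, 78: 32.5, 79: 19.5, 80: 33,
    81: 28, 82: 23, 83: 19, 84: 17.5, 85: 19.5, 86: 16.5, 87: 7.5, 88: 3, 89: 3.5, 90: 0.5,
}

N = sum(freq.values())                                   # total frequency
M = sum(x * f for x, f in freq.items()) / N              # mean
raw_mu2 = sum(f * (x - M) ** 2 for x, f in freq.items()) / N
mu2_corrected = raw_mu2 - 1.0 / 12.0                     # ASSUMPTION: Sheppard's correction

# Moment-coefficients as printed in the text.
mu2, mu3, mu4, mu5 = 37.692112, -2.587693, 3020.893695, -83.260992

beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
beta3 = mu3 * mu5 / mu2 ** 4
eps1 = 0.5 * (3.0 - beta2)       # positive when the distribution is platykurtic
eps2 = 10.0 * beta1 - beta3

print(N, round(M, 4), round(mu2_corrected, 4))
print(beta1, beta2, beta3, eps1, eps2)
```

The total and the mean depend only on the table, so they also serve as a check on the transcription of the frequencies.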













Clearly $\beta_1$ and $\beta_3$ are so small that the distribution fulfils our condition of
being very closely symmetrical. The nonic, equation (v) above, is:
$$q_2^9 - 3.057{,}789\,q_2^7 + .000{,}187{,}57\,q_2^6 + 2.858{,}817\,q_2^5$$
$$- .009{,}867\,q_2^4 - .754{,}678\,q_2^3 - .002{,}501{,}1\,q_2^2$$
$$+ .000{,}000{,}054{,}6\,q_2 - .000{,}000{,}000{,}005 = 0,$$
the last two terms being written down to many figures to show their
inappreciableness. The root required is:

$$q_2 = -.65679,$$
which by (vi) leads to:

$$\gamma^2 + .558{,}050\,\gamma - 24.755{,}802 = 0,$$

and provides the solution:

                      | Females.    | Males.
Mean:                 | 70.547 mm.  | 80.526 mm.
Total Frequency:      | 255.4       | 285.6          ...(A).
Standard Deviation:   | 3.4842 mm.  | 3.6944 mm.
Modal Ordinate*:      | 29.24       | 30.84
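
The arithmetic leading from the nonic to solution (A) can be followed with a short script. This is a minimal sketch under stated assumptions, not the author's procedure. The coefficients of the nonic and of the quadratic in $\gamma$ are taken as printed; the root is located by Newton's method started from the symmetric value $-\sqrt{\epsilon_1}$ (my choice of starting point); and the frequencies and variances use the forms $n_1 = -\gamma_2 N/(\gamma_1 - \gamma_2)$ and $\sigma_1^2 = \mu_2 + \gamma_1\gamma_2 - \tfrac{1}{3}\mu_3/\gamma_2 - \tfrac{1}{3}(\gamma_1 + \gamma_2)\gamma_1$, which follow from the first four moment equations of the mixture.

```python
import math

# Coefficients of the numerical nonic as printed, from q2^9 down to the constant term.
COEFFS = [1.0, 0.0, -3.057789, 0.00018757, 2.858817, -0.009867,
          -0.754678, -0.0025011, 0.0000000546, -0.000000000005]

def poly_and_derivative(coeffs, x):
    """Horner evaluation of a polynomial and of its first derivative at x."""
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * x + p
        p = p * x + c
    return p, dp

def newton_root(coeffs, x0, steps=50):
    """Plain Newton iteration; adequate here because the wanted root is simple."""
    x = x0
    for _ in range(steps):
        p, dp = poly_and_derivative(coeffs, x)
        x -= p / dp
    return x

eps1 = 0.4368255
q2 = newton_root(COEFFS, -math.sqrt(eps1))   # start from the equal-components value

# Constants and quadratic gamma^2 + b*gamma + c = 0, all as printed.
M, N = 75.8152, 541.0
mu2, mu3 = 37.692112, -2.587693
b, c = 0.558050, -24.755802
disc = math.sqrt(b * b - 4.0 * c)
gamma1, gamma2 = (-b + disc) / 2.0, (-b - disc) / 2.0   # subscript 1: males (larger mean)

n_males = -gamma2 / (gamma1 - gamma2) * N
n_females = gamma1 / (gamma1 - gamma2) * N
var_males = mu2 + gamma1 * gamma2 - mu3 / (3.0 * gamma2) - (gamma1 + gamma2) * gamma1 / 3.0
var_females = mu2 + gamma1 * gamma2 - mu3 / (3.0 * gamma1) - (gamma1 + gamma2) * gamma2 / 3.0
sd_males, sd_females = math.sqrt(var_males), math.sqrt(var_females)
mean_males, mean_females = M + gamma1, M + gamma2

print(q2, mean_females, mean_males, n_females, n_males, sd_females, sd_males)
```

The means obtained this way differ from the printed ones by a few thousandths of a millimetre, which is within what rounding in hand computation would produce.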


We have now to inquire how far the same result would be reached, if we had 
supposed as a first approximation equal Gaussian components and then proceeded 
to determine a second approximation by aid of (xvii) to (xix). 


Equations (ix) give us:
$$n_1 = n_2 = n = 270.5,$$
$$\gamma_1 = -\gamma_2 = \gamma = 4.9912,$$
$$\sigma_1 = \sigma_2 = 3.5750.$$

Thus to a first approximation:

                      | Females.    | Males.
Mean:                 | 70.824 mm.  | 80.806 mm.
Total Frequency:      | 270.5       | 270.5          ...(B).
Standard Deviation:   | 3.5750 mm.  | 3.5750 mm.
Modal Ordinate:       | 30.19       | 30.19
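
Solution (B) needs only $N$, $M$, $\mu_2$ and $\mu_4$. A minimal sketch follows, assuming the first-approximation formulae $\gamma^2 = \mu_2\sqrt{\epsilon_1}$ and $\sigma^2 = \mu_2(1 - \sqrt{\epsilon_1})$ with $\epsilon_1 = \tfrac{1}{2}(3 - \beta_2)$; the function name `first_approximation` is mine.

```python
import math

def first_approximation(N, M, mu2, mu4):
    """Two equal normal components: returns (n, gamma, sigma, modal ordinate)."""
    beta2 = mu4 / mu2 ** 2
    eps1 = 0.5 * (3.0 - beta2)
    if eps1 <= 0.0:
        # A real solution requires a platykurtic total distribution (beta2 < 3).
        raise ValueError("beta2 >= 3: no real first approximation")
    n = N / 2.0
    gamma = math.sqrt(mu2 * math.sqrt(eps1))
    sigma = math.sqrt(mu2 * (1.0 - math.sqrt(eps1)))
    modal = n / (sigma * math.sqrt(2.0 * math.pi))   # y0 of each normal curve
    return n, gamma, sigma, modal

M = 75.8152
n, gamma, sigma, modal = first_approximation(541.0, M, 37.692112, 3020.893695)
print(n, M - gamma, M + gamma, sigma, modal)
```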


(B), statistically speaking, is so close to (A) that it gives every confidence of 
a second approximation practically reproducing (A). 


We find:
$$\mu_3/\gamma^3 = -.020{,}8112, \qquad \mu_5/\gamma^5 = -.026{,}0386,$$
$$\sigma^2/\gamma^2 = .513{,}2871, \qquad \gamma^2/\sigma^2 = 1.948{,}228.$$


* $y_0 = n/(\sqrt{2\pi}\,\sigma)$ of the normal curve.






























Hence by (xvii) to (xix):
$$\delta\gamma_1 = \delta\gamma_2 = -\gamma \times .056{,}308 = -.2810,$$
$$\delta\sigma_1 = -\delta\sigma_2 = \sigma \times .029{,}809 = .1066,$$
$$\delta n_1 = -\delta n_2 = +n \times .056{,}308 = 15.231.$$
It will be seen from these results that:

$$\delta\gamma_1/\gamma = -.0563, \qquad \delta\sigma_1/\sigma = .0298, \qquad \delta n_1/n = .0563$$

may be considered fairly small quantities, and that they justify our assumption.
We have accordingly:

                      | Females.    | Males.
Mean:                 | 70.543 mm.  | 80.525 mm.
Total Frequency:      | 255.27      | 285.73         ...(C).
Standard Deviation:   | 3.4684 mm.  | 3.6816 mm.
Modal Ordinate:       | 29.36       | 30.96
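
The whole short method, from the five printed constants to solution (C), fits in one function. This is a sketch under assumptions rather than the author's working. The corrections are taken in the form obtained by solving the two linearised equations for $\mu_3$ and $\mu_5$: $\delta\gamma_1/\gamma = \tfrac{5}{4}(1 + 3\sigma^2/\gamma^2)\,\mu_3/\gamma^3 - \tfrac{3}{8}\,\mu_5/\gamma^5$, $\delta\sigma_1/\sigma = -\tfrac{1}{4}(\gamma^2/\sigma^2)\{(1 + 5\sigma^2/\gamma^2)\,\mu_3/\gamma^3 - \tfrac{1}{2}\,\mu_5/\gamma^5\}$, and $\delta n_1/n = -\delta\gamma_1/\gamma$. The function name is mine.

```python
import math

def second_approximation(N, M, mu2, mu3, mu4, mu5):
    """Equal-components start, then first-order corrections for mu3 and mu5."""
    eps1 = 0.5 * (3.0 - mu4 / mu2 ** 2)
    if not 0.0 < eps1 < 1.0:
        raise ValueError("first approximation is not real for these moments")
    n = N / 2.0
    gamma_sq = mu2 * math.sqrt(eps1)
    sigma_sq = mu2 - gamma_sq
    gamma, sigma = math.sqrt(gamma_sq), math.sqrt(sigma_sq)

    a3 = mu3 / gamma ** 3
    a5 = mu5 / gamma ** 5
    r = sigma_sq / gamma_sq

    d_gamma = 1.25 * (1.0 + 3.0 * r) * a3 - 0.375 * a5          # delta gamma_1 / gamma
    d_sigma = -0.25 / r * ((1.0 + 5.0 * r) * a3 - 0.5 * a5)     # delta sigma_1 / sigma
    d_n = -d_gamma                                              # delta n_1 / n

    # Subscript 1 (males): gamma_1 = gamma + delta; subscript 2: gamma_2 = -gamma + delta.
    males = {"mean": M + gamma * (1.0 + d_gamma), "n": n * (1.0 + d_n),
             "sd": sigma * (1.0 + d_sigma)}
    females = {"mean": M - gamma * (1.0 - d_gamma), "n": n * (1.0 - d_n),
               "sd": sigma * (1.0 - d_sigma)}
    return males, females

males, females = second_approximation(
    541.0, 75.8152, 37.692112, -2.587693, 3020.893695, -83.260992)
print(females, males)
```

Worked in full floating-point precision the results agree with (C) to within about 0.1 in the frequencies and a few thousandths of a millimetre in the means and standard deviations.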


It is clear that the solutions (C) and (A) are for all practical purposes identical. 
Thus the short method is justified in the problem of sexing osteometric material. 
An improper extension of the method to material in which the sexes occur in very 
unequal groups may be guarded against by simply observing whether $\beta_1$ and $\beta_3$
are very small quantities.


In conclusion it may be desirable to compare the values of these sex-constants 
as found mathematically with sexing by anatomical appreciation. I owe an 
anatomical sexing of the same bones to my colleague, Dr Derry. 


The following values of the constants resulted : 


                      | Females.    | Males.
Mean:                 | 70.098 mm.  | 79.764 mm.
Total Frequency:      | 221         | 320            ...(D).
Standard Deviation:   | 3.5148 mm.  | 4.1254 mm.
Modal Ordinate:       | 24.55       | 30.95


It will be seen that the mathematically deduced constants are not widely 
divergent from those obtained anatomically, but the accordance if fair is not ideal. 
The accompanying diagram exhibits the differences in the frequency distributions 
found by the two methods of sexing. The chief difference lies in the transfer by 
the anatomist of the larger female bones of the mathematical sexing to the male 
group. I do not propose to discuss here the relative advantages of the two 
methods, but would draw attention to a few points of interest : 


(i) The solution (D) makes no appeal to measurement in the sexing; it is
based purely on an anatomical appreciation. It would therefore be subject to
personal equation, depending on the features upon which the experience of the
individual anatomist leads him to lay most stress. The solution (C) is unique,
that is to say, given the same data, all statisticians would reach the same values, of
course apart from errors in arithmetic or from the number of decimal places
retained in the working. It eliminates the factor of personal equation.


[Diagram: London 17th Century Femora. Bicondylar Width. Comparison of Mathematical and Anatomical Sexing. Abscissa: Bicondylar Width in mm. Ordinate: Frequency. Legend: Anatomical Sexing (female bones, male bones, combined); Mathematical Sexing (Gaussian for females, Gaussian for males, combined).]


Frequency Distributions of Bicondylar Width in Male and Female Femora
sexed by Anatomical Appreciation.

mm. |  ♀    |  ♂   | mm.
61  |  1    |  -   | 71
62  |  1    |  -   | 72
63  |  1.5  |  -   | 73
64  |  5    |  -   | 74
65  | 13.5  |  -   | 75
66  | 14    |  -   | 76
67  | 15.5  |  -   | 77
68  | 22    |  -   | 78
69  | 31    |  -   | 79
70  | 15.5  |  3.5 | 80


(ii) (C) would, however, be influenced by the fact that our material is not
perfectly homogeneous except for sex; because (a) there is a mixture of right and
left bones, and, to judge by the anatomical sexing, this may involve a difference of
.7 to .9 mm. in the means and .08 to .24 mm. in the standard deviations; this
would add to the heterogeneity, (b) our bones may be due to somewhat mixed
classes and possibly mixed periods, (c) the bicondylar width is liable to be injured
by rough treatment of the bone, and this injury will most affect the weaker, and
therefore probably the younger, bones. These bones might then be treated as female,
a classification which most anatomical sexing also favours. While the total number
of these London femora is nearly 800, the bicondylar width could only be measured
in 541 cases. This selection will not necessarily be random as to size or sex,
and may modify our constants found mathematically from the distribution. On
the other hand it would affect also the anatomical appreciation of sex, but only
in as far as it was based on the size of the condyles.


(iii) We know from very considerable sexed data that the variation of man
and woman is very nearly the same. The coefficients of variation measured in the
usual way, i.e. by 100 standard deviation divided by mean, gave:

                      |  ♀    |  ♂    |  Δ
Mathematical Sexing.  | 4.92  | 4.57  | +.35
Anatomical Sexing.    | 5.01  | 5.17  | -.16

















KARL PEARSON 487


There was thus closer sexual accord from the anatomical method. But when 
the same anatomical sexing was applied to the character of the head of the femur 
in the vertical plane, I found for right bones: 


♀ 5.05    ♂ 6.37    Δ = −1.32,
and for left bones:
♀ 4.91    ♂ 6.10    Δ = −1.19,


differences far greater than occur in the mathematical sexing from the bicondylar 
widths. Accordingly no great stress can be laid on inequalities in the coefficients 
of variation deduced from either process of sexing. 


It would appear to me that we have reached on the whole a reasonable 
biometric method of sexing. To what extent it can replace the sexing by 
anatomical appreciation must be left to the future. But it is clear that when 
anatomists themselves prefer to that appreciation an appeal to a single character, 
e.g. to the measurement of the femoral head, and only settle by anatomical appre- 
ciation the sex of femora with diameters between 45 and 47 mm., then they do 
not show much confidence in their own method of sexing. An interesting experi- 
ment could be made if some 400 to 500 sexed bones were available, and then, 
without knowledge of the real sex, two or three anatomists and a statistician were 
to be asked independently to determine the mean and variability of two or three 
characters of the bones of each sex in this material. 


I have cordially to acknowledge the help of my colleague Mr E. Soper in the 
determination of equations (xvii)—(xix) and in their solution (C) in the numerical 
case for which I had reached the solution (A); also the labour of my colleague 
Miss H. Gertrude Jones in the preparation of the diagram which contrasts 
graphically the mathematical and anatomical solutions of the problem. 











FURTHER EVIDENCE OF NATURAL SELECTION 
IN MAN. 


By ETHEL M. ELDERTON, Galton Research Fellow, 
AND KARL PEARSON, F.R.S.


(1) The second author of the present paper writing in 1894 a commentary on 
the statement that “no man, as far as we know, has ever seen natural selection at 
work,” remarked : “ Every man who has lived through a hard winter, every man 
who has examined a mortality table, every man who has studied the history of 
nations has probably seen natural selection at work*.” The emphasis is here to 
be laid on the word “probably,” because the seeing depends on the power and 
validity of the scientific means adopted to analyse the observed facts. In a paper 
communicated by the same author to the Royal Society in June 1912†, it was
shown from the Registrar-General’s series of ten yearly life-tables that when 
allowance was made for change of environment in the course of the fifty years a 
very high association existed between the deaths in the first year of life and 
the deaths in childhood (1 to 5 years). This association was such that if the 
infantile deathrate increased by 10%, the child deathrate decreased by 5.3% in
males, while in females the fall in the child deathrate was almost 1% for every
rise of 1% in the infantile deathrate. The method of investigating by life-tables
could not be extended beyond 1900, because the life-tables for the next ten 
years (1901-1910) were not then out, and indeed have only just appeared 
(December 1914). While the infantile deathrate as shown from the life-tables 
had risen from 1871-1900, the child deathrate had fallen for the same period. 
During the next decade 1900-1910 both deathrates have fallen together; such 
a secular change does not in any way modify the argument of the paper, which 
lies in the statement that whether two deathrates rise together or rise and fall 
simultaneously we can draw no inferences at all, until they have been corrected for
secular change. Most economic, demographic and physical variates are changing 
continuously with time, and no comparison of time graphs or calculation of 
correlations will demonstrate of necessity anything but spurious association, until 


* The Chances of Death and other Studies in Evolution, Vol. 1. p. 166. 
† “The Intensity of Natural Selection in Man.” R. S. Proc. B. Vol. 85, pp. 469—476.











ETHEL M. ELDERTON AND KARL PEARSON 489


the time factor has been eliminated. It is the deviations from the continuous 
curves of secular change which may turn out on careful analysis to be truly 
indicative of causal relationship between the variates under consideration. 


The first attempt to get rid of secular change by a method of differences was
made by Miss F. E. Cave in 1904 in a paper on barometric correlations*, and
shortly afterwards Mr R. H. Hooker published a paper dealing with the same
point†. Both these authors used only first differences and gave no general theory
of the method. Quite recently “Student” has published a paper‡ giving the
fundamental formulae, and indicating how by taking successive differences of two
variates and correlating them, we free ourselves from the time or locality influence,
and approach the true and probably causal relationship between them. When the
correlation of the differences becomes steady, then we have reached the actual
correlation of the variates corrected for the time factor, provided an assumption is
made which we shall discuss at greater length below: see footnote, p. 495. Meanwhile
Dr Anderson of Petrograd has been working on the subject, and in a
most valuable memoir§ he has added to “Student’s” results a number of new
theorems; for example, the probable errors of the successive difference correlations
when they become steady, and the relations which should be fulfilled
between the squares of the standard deviations of successive differences, when
the series has become steady. We have thus a double means of ascertaining
whether the desired object, the elimination of the time-factor, has been approximately
achieved. A third additional test will be indicated in this paper.
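The working of the method may be seen on made-up figures. What follows is a minimal sketch and not the authors' computation: the two series, their quadratic trends and the helper names `differences` and `correlation` are assumptions introduced purely for illustration. Two variates are given a common rising secular trend but exactly opposed intrinsic deviations; the crude correlation is then close to +1, while the correlation of the differences, once the trend has been differenced away, is −1.

```python
import math
import random


def differences(series, order):
    """Return the successive (backward) differences of the given order."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def correlation(xs, ys):
    """Product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: 50 "years" of two variates sharing a quadratic secular
# trend, with intrinsic deviations d and -d that are exactly opposed.
rng = random.Random(0)
intrinsic = [rng.uniform(-1.0, 1.0) for _ in range(50)]
x = [t * t + d for t, d in enumerate(intrinsic)]
y = [2 * t * t - d for t, d in enumerate(intrinsic)]

raw_r = correlation(x, y)
diff_r = [correlation(differences(x, k), differences(y, k)) for k in range(1, 5)]

print("crude correlation:", round(raw_r, 3))
for k, r in enumerate(diff_r, start=1):
    print("difference order", k, "correlation:", round(r, 3))
```

With a quadratic trend the second difference reduces the trend to a constant and the third removes it altogether, so in this artificial case the correlations are steady from the second difference onwards. Real series need not behave so neatly.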


This new statistical process has been termed the Variate Difference Correlation 
Method||, and there is small doubt that it is the most important contribution to 
the apparatus of statistical research which has been made for a number of years 
past. Its field of application to physical problems alone seems inexhaustible. We 
are no longer limited to the method of partial correlation, nor compelled to seek 
for factors which rendered constant will remove the changing influence of environ- 
ment. In the present case, that of the influence of infantile mortality on child 
mortality, Pearson endeavoured to eliminate the influence of continual environmental
improvement by making the expectation of life at six years constant¶.
Snow achieved the same object by correlating the deathrates of one sex for a 
constant deathrate of the other**. In both these cases substantial evidence of 
Natural Selection was obtained from the mortality tables. The object of the 
present paper is to demonstrate by the still more complete elimination of the 

* R. S. Proc. Vol. LXXIV. pp. 407 et seq.

† Royal Statistical Society Journal, Vol. LXVIII. pp. 396 et seq. 1905.

‡ Biometrika, Vol. X. pp. 179, 180.

§ Ibid. pp. 269—279.

|| Pearson and Cave: “Numerical Illustrations of the Variate Difference Correlation Method.”
Biometrika, Vol. X. pp. 340—355.

¶ R. S. Proc. B. Vol. 85, p. 472.

** “The Intensity of Natural Selection in Man.” Drapers’ Company Research Memoirs, Dulau & Co.,
1911.

















490 Further Evidence of Natural Selection in Man 


time factor involved in the variate difference correlation method that a selective 
deathrate plays even in highly civilised states a marked part in the natural history 
of man. 


(2) The material dealt with in this investigation consists of the Registrar- 
General’s returns for births in England and Wales and of deaths in the first five 
years of life from 1859 to 1908 with the addition of as many years before 1859 
as were requisite to make our highest differences fifty in number, and with the 
addition of as many years after 1908 as were requisite for following up the births 
of that year to the fifth year of life. Thus actually our data extended from 1850 
to 1912. The reason for this procedure lies in the desirability of using a constant 
population, and not reducing by one a relatively small number like 50 on each 
differencing. As a result of this process we had to modify Dr Anderson’s values
for the probable errors for the steady values of the difference correlations because 
in our case the size of the population does not change as we proceed to higher 
differences*. The second cause which requires extension of the data is a very 
important one, and must be illustrated numerically. Consider the table: 


Deaths of those born in a given year.

Year    Births     0—1       1—2       2—3      3—4      4—5
1908    478,410    63,594    -         -        -        -
1909    -          -         14,146    -        -        -
1910    -          -         -         5,020    -        -
1911    -          -         -         -        3,449    -
1912    -          -         -         -        -        2,341





Now the deaths of infants 0—1 in 1908 are not necessarily of infants all born
in 1908, but the total deaths 63,594 must represent closely the deaths in the
478,410 infants born in that year. Disregarding immigration and emigration, this
gives a deathrate per 1000 of 132.928 and leaves 414,816 children alive. Of this
group 14,146 may be taken to die in the second year of life, giving a deathrate of
34.102 per mille. There remain 400,670 children who reach the third year of life
in 1910, of whom 5,020 die, giving a deathrate of 12.529, and 395,650 survivors.
These survivors are followed into 1911 and 1912 in the same manner, and thus
we obtain approximately the deathrate up to the fifth year of the male children
born in 1908. We thus in bulk follow the same group of children through the
first five years of life. Tables I and II give the deathrates for males and females
respectively under the heading of the birth year of each group. These deathrates
have been taken to three decimal places for the purpose of determining the
higher differences correctly to one decimal place. The successive differences of

* All the probable errors of the difference correlations given in this memoir are these modified 


Andersonian values, i.e. they are the probable errors on the assumption that the difference correlations 
have reached steady values. 
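The following of a single group of births through the first five years of life can be sketched as below. This is an illustration only: the function name `cohort_deathrates` is an assumption, and the counts are those quoted above for the children born in 1908. Each deathrate is taken on the survivors entering that year of life, migration being disregarded as in the text.

```python
def cohort_deathrates(births, deaths_by_year_of_life):
    """Follow one birth cohort through successive years of life.

    Returns (rates per 1000 entering each year of life, survivors after it).
    Migration is disregarded.
    """
    rates, survivors = [], []
    alive = births
    for deaths in deaths_by_year_of_life:
        rates.append(1000.0 * deaths / alive)
        alive -= deaths
        survivors.append(alive)
    return rates, survivors


# Counts quoted above for the group born in 1908.
births_1908 = 478_410
deaths_1908_cohort = [63_594, 14_146, 5_020, 3_449, 2_341]

rates, survivors = cohort_deathrates(births_1908, deaths_1908_cohort)
for year_of_life, (rate, left) in enumerate(zip(rates, survivors), start=1):
    print(f"year of life {year_of_life}: {rate:.3f} per 1000, {left} survivors")
```

The first four rates come out at 132.928, 34.102, 12.529 and 8.717 per 1000, which are the entries for 1908 in Table I (Males).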








ETHEL M. ELDERTON AND KARL PEARSON 491


TABLE I.

Males.

Deathrates in each Year of Life for groups born in the Year of 1st column.
(A “?” marks a figure not legible in this copy.)

Year      0—1       1—2      2—3      3—4      4—5
1850    159.781   63.935   34.092   22.138   19.021
1851    168.706   64.977   33.882   26.657   18.209
1852    173.324   63.533   40.936   24.316   16.022
1853    174.808   74.802   35.464   22.298   16.720
1854    170.890   60.598   30.740   21.688   21.329
1855    169.151   59.696   33.004   29.546   19.780
1856    156.756   64.317   39.551   25.594   13.751
1857    168.486   67.928   36.777   19.471   13.923
1858    172.591   68.712   30.566   19.857   16.268
1859    167.106   58.887   31.650   22.325   21.962
1860    162.642   70.401   35.297   30.083   21.325
1861    167.634   65.785   41.390   27.654   16.236
1862    156.684   73.848   37.326   22.013   14.982
1863    163.183   67.537   32.774   21.088   12.552
1864    166.309   66.462   34.408   17.660   15.991
1865    174.356   68.369   28.278   21.473   17.015
1866    173.659   60.603   32.159   22.751   18.105
1867    166.905   63.790   32.731   23.220   16.085
1868    168.064   62.988   32.357   20.185   12.821
1869    169.022   65.716   30.186   17.450   11.624
1870    174.287   62.401   26.760   16.344   16.052
1871    171.840   59.853   25.503   21.635   14.727
1872    162.321   55.267   30.754      ?     12.655
1873    163.676   61.415   28.007   17.275   12.502
1874    164.976   60.316   25.232   16.028   13.499
1875    173.145   58.422   25.400   18.848   13.187
1876    160.415   56.088   27.992   17.044   13.263
1877    149.627   63.344   26.104   17.171   11.738
1878    166.266   58.104   27.853   15.280   13.049
1879    149.754   66.188   22.245   16.741   12.115
1880    167.313   48.181   25.909   16.036   11.920
1881    142.532   59.164   23.333   15.261   10.430
1882    153.154   54.365   24.026   14.362    9.416
1883    151.184   58.292   23.015   13.823   11.042
1884    160.381   53.952   22.077   14.792    9.533
1885    151.175   57.960   23.052   13.615   10.073
1886    163.081   55.611   21.009   13.974   10.569
1887    158.243   50.713   22.169   14.394    9.894
1888    150.177   56.882   22.958   13.536   10.150
1889    157.476   56.279   22.338   13.750   10.647
1890    164.757   59.098   22.604   14.551    9.623
1891    163.761   54.255   20.821   12.615    8.757
1892    162.112   52.776   19.442   12.497   10.604
1893    173.333   47.035   20.343   14.121    8.546
1894    149.633   55.787   21.271   11.750    8.439
1895    176.280   51.404   19.535   11.912    8.973
1896    160.989   50.293   18.742   11.802    9.366
1897    170.291   50.986   19.124   12.396    8.608
1898    175.183   48.227   19.380   11.496    8.657
1899    176.606   49.837   17.094   11.565    7.165
1900    168.685   44.728   17.400    9.759    ?.553
1901    165.617   42.597   15.570   10.241    ?.135
1902    146.791   40.581   16.572    9.587    6.853
1903    144.567   45.517   15.368    9.260    6.949
1904    158.684   38.921   15.234   10.194    6.458
1905    141.193   39.326   15.702    8.893    6.886
1906    144.819   38.037   14.222    8.858    5.655
1907    130.259   36.615   15.159    7.643    ?.230
1908    132.928   34.102   12.529    8.717    ?.962





492 Further Evidence of Natural Selection in Man


TABLE II.

Females.

Deathrates in each Year of Life for groups born in the Year of 1st column.
(An entry “…” marks a figure not legible in this copy.)

Year      0—1       1—2      2—3      3—4      4—5
1850       …      62.235   34.147   22.625   19.278
1851       …      62.106   33.992   27.087   16.762
1852       …      61.812   39.787   23.805   16.148
1853       …      71.939   35.186   22.686   16.948
1854       …      59.024   30.863   22.226   21.907
1855       …      57.140   34.057   29.375   20.501
1856       …      61.902   39.758   26.146   14.305
1857       …      65.853   36.712   19.990   14.391
1858       …      64.513   29.716   20.796   17.089
1859       …      55.535   32.024   24.485   21.454
1860       …      66.902   36.060   29.941   20.765
1861       …      61.860   41.243   27.727   16.287
1862       …      70.313   36.543   21.905   15.398
1863       …      63.438   32.637   21.623   12.702
1864       …      63.395   34.846   18.217   15.547
1865       …      66.401   28.484   21.437   17.145
1866       …      58.180   32.392   23.086   17.536
1867       …      61.641   33.243   23.081   15.653
1868       …      58.624   31.892   20.060   12.524
1869       …      60.720   31.004   18.108   11.732
1870       …         …     26.855   16.235   15.196
1871       …         …     25.062   21.493   13.903
1872       …         …     30.213   18.454   12.571
1873       …         …     27.549   16.950   11.421
1874       …         …     25.498   16.031   13.253
1875       …         …     24.563   18.848   12.833
1876       …         …     27.976   16.685   12.651
1877       …         …     25.135   17.441   11.214
1878       …         …     27.646   15.186   12.491
1879       …         …     21.792   16.911   11.880
1880       …         …     25.319   15.556   11.335
1881       …         …     22.443   15.359   10.396
1882       …      50.057   23.539   14.178    9.586
1883       …      54.048   22.948   13.643   10.606
1884       …      49.967   21.259   14.913    9.401
1885       …      53.415   22.740   13.251   10.030
1886       …      51.103   19.985   13.800   10.295
1887       …      46.402   21.538   14.594   10.012
1888       …      53.369   22.456   13.926   10.146
1889       …      53.307   21.783   14.170   10.569
1890       …      54.850   21.607   14.731    9.478
1891       …      51.567   20.841   12.646    8.900
1892    132.414   49.387   19.174   12.708   10.370
1893    143.346   44.511   19.842   14.293    8.643
1894    123.522   52.267   21.499   12.183    8.300
1895    144.326   49.339   18.643   11.817    8.499
1896    133.535   46.691   18.439   12.184    9.282
1897    140.755   47.998   18.109   12.739    8.642
1898    145.001   44.832   18.436   11.490    8.865
1899    148.000   45.928   16.954   11.724    7.164
1900    139.148   41.901   16.902    9.712    7.611
1901    136.346   39.527   15.156   10.606    7.184
1902    118.479   37.064   16.168    9.891    6.792
1903    118.004   42.270   14.774    9.315    7.615
1904    131.477   36.598   14.136    9.793    6.453
1905    114.641   37.084   15.066    8.710    6.902
1906    119.668   36.006   13.552    9.168    5.513
1907    104.487   33.904   13.789    7.493    6.071
1908    107.495   31.990   11.939    8.546    5.939

















ETHEL M. ELDERTON AND KARL PEARSON 493


these deathrates up to the sixth and, in a few cases, to the tenth were then
formed. In our notation $m_r$ is the deathrate in the $r$th year of life, i.e. from $r-1$
to $r$ years of age, and $\delta_s m_r$ is the $s$th difference of this deathrate. As we have five
deathrates for each sex this involves 10 means, 10 standard deviations and 20 correlation
coefficients, but as we have used six successive differences these numbers
must be multiplied by seven. The calculation of these differences and of upwards
of 150 correlation coefficients has meant very strenuous labour. It must, indeed, be
admitted that the application of the variate difference correlation method is not, 
even with small populations, a light task, but the change from the high positive 
to low negative and then to high negative values of the correlation is of 
extraordinary interest, and indicates the stages by which the associations are 
freed from the spurious influence of the time-factor. 


(3) All our correlations are given in Table III (p. 497), but it is desirable to
discuss in detail certain groups of them. We take first the correlations of the
deathrates in successive years. They are:


                   Male.            Female.
$r_{m_1 m_2}$      +.398 ± .080     +.390 ± .081
$r_{m_2 m_3}$      +.859 ± .025     +.864 ± .024
$r_{m_3 m_4}$      +.924 ± .014     +.928 ± .013
$r_{m_4 m_5}$      +.911 ± .016     +.917 ± .015


All these are positive, all are significant and, the first excepted, are very high 
correlations. There is no significant difference between male and female. The 
least important is the relation between deaths in infancy and deaths in the first 
year of childhood. We have in these correlation coefficients the numerical 
expression of what is obvious in Tables I and II, i.e. as the deathrate in any year
of age falls so does the deathrate of the same group in the following year. It is
this fact which has led to the erroneous idea that natural selection plays no part
in man. The fact, however, simply expresses the continuous change of environment
which has been in progress since 1860. During the half-century improved
economic conditions, bettered sanitation, and developed medical care have lowered 
the deathrate at each age*. It is therefore impossible to deduce any argument 
as to natural selection in man from these correlations until we have removed this 
continuous influence of the time-factor. This is achieved by the variate difference 
correlation method. In every case a preliminary examination of Tables I and II 
shows that the correlation of the first differences of the deathrates of successive 
years is negative, and as we take higher and higher differences the intensity of this 
negative correlation increases, until with the sixth differences it reaches to the 


* As we have already remarked the infantile deathrate showed little of this improvement till 1905.
It was about this same year that the absolute number of births in England and Wales began to decline,
so that while the population has increased by something like 3½ millions, that population produces
about 76,000 fewer babies annually.



















494 Further Evidence of Natural Selection in Man 


very substantial value of about −.7. In other words a rise in the deathrate of
one year of life means a fall in the deathrate of the following year of a most
marked kind. While with the sixth differences we are approaching fairly closely
steady values it may be doubted whether we have reached them in any case but
that of $r_{\delta_6 m_4 \cdot \delta_6 m_5}$. The following are the sixth difference correlations in the case
of the deathrates of successive years:


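                                           Male.            Female.

$r_{\delta_6 m_1 \cdot \delta_6 m_2}$      −.688 ± .090     −.719 ± .081
$r_{\delta_6 m_2 \cdot \delta_6 m_3}$      −.673 ± .092     −.660 ± .095
$r_{\delta_6 m_3 \cdot \delta_6 m_4}$      −.703 ± .085     −.731 ± .078
$r_{\delta_6 m_4 \cdot \delta_6 m_5}$      −.695 ± .087     −.736 ± .077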


Again the male and female results are in excellent agreement, and we grasp 
the startling manner in which the new method reverses a judgment based on 
relations which have been deduced without any regard to secular change. 


(4) The question naturally arises: How far are these the “steady” values of 
the difference correlations measuring the organic relation apart from the time- 
factor of the deathrates in different years of infancy and childhood ? 


There are three fundamental tests: (i) The correlation coefficients of suc- 
cessive differences should have ceased to be markedly rising or falling. Table III 
(p. 497) shows that this is approximately but not absolutely the case, but we have 
reached a stage in which any further changes are certainly of the order of the 
probable errors and thus of little significance. The unsteadiness as will be in- 
dicated later in better tests is greatest in the differences of the deathrates in the 
first and second years of life. Here the correlations were taken to the seventh 
and eighth differences and gave: 


                                           Male.            Female.
$r_{\delta_7 m_1 \cdot \delta_7 m_2}$      −.696 ± .090     −.729 ± .082
$r_{\delta_8 m_1 \cdot \delta_8 m_2}$      −.692 ± .094     −.731 ± .084


which appear to have reached practical steadiness. Actually the final correlations 
must be somewhat greater than those obtained from the sixth differences. To 
push the process further, however, would be of small advantage because higher 
differences involve introducing earlier data, and the birthrate data before 1855 
become more and more unreliable. Again in the extremely high differences, the 
additional year required for an additional difference if not appertaining to rela- 
tively smooth data may in itself, when we have only a small total frequency of 50, 
produce a certain amount of unsteadiness. 


(ii) We may consider the mean values of the differences. 











ETHEL M. ELDERTON AND KARL PEARSON 495


If our first variable be taken* as $x = \phi_1(t) + X$, where $X$ is the intrinsic value
of $x$ as apart from the time change, then mean $\delta_{r+1}x$ after steadiness has set in is


* One of the bases of the variate difference correlation method lies in the assumption that the 
intrinsic variation is superposed on a secular change of a continuous character; the causes which 
determined the intrinsic variation X are supposed to be sensibly independent of the time for the 
period under consideration. We conceive the secular change as given by a parabola, say, of the 
sth order, but the deviations from this curve are supposed in magnitude and sense to be independent 
of the time, i.e. due to chance causes which are the same in 1850 as in 1900. This assumption 
is an important one and must lead to our seeking relatively short periods consistent with a numerical 
frequency sufficient for significance. It can be roughly tested, of course, by considering $\sigma_X$ as found
from, say, the first and second halves of our observations. In our own case we found:

Values of $\sigma_X$ deduced from Sixth Differences for 1st 25, for 2nd 25,
and for all 50 years.

                   ($m_1$)        ($m_2$)        ($m_3$)        ($m_4$)        ($m_5$)
                   ♂     ♀       ♂     ♀       ♂     ♀       ♂     ♀       ♂     ♀
1st 25 years ...  7.32  6.94    5.51  5.61    2.09  2.30    1.52  1.67    1.05  0.91
All 50 years ...  8.61  7.83    4.71  4.63    1.59  1.77    1.17  1.28    0.86  0.78
2nd 25 years ...  9.70  8.61    3.73  3.37    0.83  0.98    0.66  0.68    0.63  0.62
| | | 








These values are less steady than we had originally hoped for. Clearly the variability of the $X$
portion of the infantile deathrate has grown greater, and that of the four child deathrates has grown
sensibly smaller with the time. The fundamental hypothesis of the variate difference method is therefore
only approximately true for this material. We have made some investigations on the assumption
that $x = \phi_1(t) + (a + bt)X$, but the values of $a$ and $b$ obtained were by no means satisfactory. We have
in hand a further investigation of the problem by the method, originally suggested by one of us, before
the difference method was started; namely to subtract from $x$ the value obtained by the best fitting
parabola of the $s$th order in the time and so to reach the actual values of $X$. The relation of these
to the time can then be found with some degree of accuracy. To the male deathrates of the second and
fourth years of life we applied parabolae of the third order in the time, and obtained excellent fits; we
then subtracted the ordinates of these parabolae from the deathrates and correlated the remainders,
dy and dy say. We found "dedy= +°312+ -088, a value corresponding more nearly with TSgmy dgmy than 


T Samy Bymy? and indicating that we might more rapidly approach finai values by this method than by 
that of variate differences. But the fitting of high order parabolae is very laborious; at the same time 
the graphs give excellent tests of the accuracy of the work, and we obtain the actual values of what 
we have termed X and Y, as represented by dz and dy. We then correlated the numerical value of ds 
with the time and found ry ,= —*284+-089. It is clear:that with correlations of this order with the 


time, rg q. would not be modified by the extent of its probable error if we found the partial corre- 
aM, 
lation ,rg q_: or corrected the correlation of d, and d, for the time. There is another point, however, 
24 


which justifies us in disregarding this variation of $X$ and $Y$ with the time as of secondary importance.
The correlation of $X$ with the time is positive in the first year’s mortality and negative in the following
four years; thus while it would certainly tend to give a negative value to $r_{XY}$ for the 1st and 2nd
years of life, it would tend to give a positive value to the correlation for all successive pairs of years
beyond the 1st and 2nd. Now all such successive pairs of years have high negative values, which are
therefore minimum values, but these values are all in excellent agreement (roughly equal to −.7) with
that found for the 1st and 2nd years of life. We therefore concluded that the influence of the time on
the deviations from the secular curve of change, although very sensible, is of no substantial importance
for the correlations.
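The alternative procedure described in this footnote, namely fitting a parabola of given order to each series and correlating the remainders, can be sketched as follows. This is an illustration under stated assumptions and not the authors' working: the series are made up, the time is rescaled to the range −1 to +1 for numerical convenience, and the helper names `fit_polynomial`, `residuals` and `correlation` are hypothetical.

```python
import math
import random


def fit_polynomial(ts, ys, degree):
    """Least-squares polynomial coefficients (lowest power first)."""
    m = degree + 1
    # Normal equations for the monomial basis 1, t, t^2, ...
    a = [[sum(t ** (i + j) for t in ts) for j in range(m)] for i in range(m)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, m):
            f = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= f * a[col][k]
            b[row] -= f * b[col]
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        s = sum(a[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def residuals(ts, ys, degree):
    """Remainders after subtracting the best-fitting parabola of that order."""
    c = fit_polynomial(ts, ys, degree)
    return [y - sum(ck * t ** k for k, ck in enumerate(c)) for t, y in zip(ts, ys)]


def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical series: cubic secular trends plus negatively related deviations.
rng = random.Random(1)
ts = [(i - 24.5) / 24.5 for i in range(50)]
X = [rng.uniform(-1, 1) for _ in ts]
Y = [-0.7 * xv + 0.5 * rng.uniform(-1, 1) for xv in X]
x = [60 - 20 * t + 5 * t ** 3 + xv for t, xv in zip(ts, X)]
y = [25 - 10 * t + 2 * t ** 3 + yv for t, yv in zip(ts, Y)]

r_resid = correlation(residuals(ts, x, 3), residuals(ts, y, 3))
r_intrinsic = correlation(residuals(ts, X, 3), residuals(ts, Y, 3))
print(round(r_resid, 3), round(r_intrinsic, 3))
```

Because the least-squares fit is linear in the data, an exactly cubic trend is removed completely, and the two printed correlations agree. With real deathrates the trend is not exactly a parabola and the agreement would only be approximate.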




















496 Further Evidence of Natural Selection in Man 


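equal to mean $\delta_{r+1}X$, and this (taking, as we have done, ‘backward’ differences)
is given (the $C$’s being the usual binomial coefficients) by

$$\text{mean } \delta_{r+1}X = \frac{-\left(X_0 - {}_rC_1 X_{-1} + {}_rC_2 X_{-2} - \dots\right) + \left(X_n - {}_rC_1 X_{n-1} + {}_rC_2 X_{n-2} - \dots\right)}{n}.$$

Now if we remember that the $X$’s have chance values uncorrelated with each
other then we shall have for the squared standard deviation of the mean $\delta_{r+1}X$,

$$\sigma^2_{\text{mean } \delta_{r+1}X} = \frac{2\sigma_X^2\left(1 + {}_rC_1^2 + {}_rC_2^2 + \dots + {}_rC_r^2\right)}{n^2} = \frac{2\sigma_X^2}{n^2}\,\frac{(2r)!}{(r!)^2}.$$

Or, the probable error of the mean $(r+1)$th difference after the steady values
have been reached

$$= .67449\,\sqrt{\frac{2\,(2r)!}{(r!)^2}}\;\frac{\sigma_X}{n}.$$

At first sight this appears of no value, because $\sigma_X$ is unknown, but Dr Anderson
has given $\sigma_{\delta_{r+1}X}$ in terms of $\sigma_X$ when steady values have been reached*, i.e.

$$\sigma^2_{\delta_{r+1}X} = \frac{(2(r+1))!}{(r+1)!\,(r+1)!}\,\sigma_X^2.$$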


From this we deduce the probable error of a mean $r$th difference to be

$$.67449\,\frac{\sigma_{\delta_r X}}{n}\,\sqrt{\frac{r}{2r-1}},$$

when we assume steadiness reached.


The values of the means of the differences with their probable errors on the
assumption of steadiness are given in Table IV, and the ratio of the means to
their probable errors in Table V.
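The probable error of a mean difference can be computed either from $\sigma_X$ or from the observed standard deviation of the differences of the same order. The sketch below is illustrative only: the function names are assumptions, and the second form, with the factor $\sqrt{r/(2r-1)}$, is a deduction from the binomial sums and Anderson's relation rather than a formula read off the page. It checks numerically that the two forms agree and that the sum of squared binomial coefficients equals $(2r)!/(r!)^2$.

```python
import math

PE_FACTOR = 0.67449  # probable error = 0.67449 x standard deviation


def sum_squared_binomials(r):
    """1 + C(r,1)^2 + ... + C(r,r)^2, which equals C(2r, r)."""
    return sum(math.comb(r, k) ** 2 for k in range(r + 1))


def pe_mean_difference_from_sigma_x(sigma_x, order, n):
    """P.E. of the mean difference of the given order (>= 1), from sigma_X."""
    r = order - 1
    return PE_FACTOR * math.sqrt(2 * math.comb(2 * r, r)) * sigma_x / n


def pe_mean_difference_from_sd(sd_of_difference, order, n):
    """The same P.E., from the observed s.d. of the differences of that order."""
    return PE_FACTOR * sd_of_difference * math.sqrt(order / (2 * order - 1)) / n


sigma_x, n = 1.0, 50  # hypothetical intrinsic s.d. and number of years
for order in range(1, 7):
    # Anderson's relation: variance of the differences = C(2r, r) * sigma_X^2.
    sd_diff = sigma_x * math.sqrt(math.comb(2 * order, order))
    a = pe_mean_difference_from_sigma_x(sigma_x, order, n)
    b = pe_mean_difference_from_sd(sd_diff, order, n)
    print(order, round(a, 4), round(b, 4))
```

The second form is the convenient one in practice, since the standard deviation of the differences is observed directly while $\sigma_X$ is not.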


It will be seen that the positive and negative signs are not scattered quite as
much at random as we might have hoped and that this is especially the case in
the infantile mortality differences†. If we take all the ratios of the means to
their probable errors except the first difference, we find their average value 1.16;
it should be of course 1.18. Of these ratios 33 are positive and 25 negative. If we
omit the ratios for the first year of life, we find 24 negative and 20 positive, while
the mean value = .98 as against 1.18, the theoretical ratio of the mean to its
probable error. It is obvious that the infantile mortality differences are those
which are anomalous. Otherwise the mean differences vary fairly satisfactorily


* Biometrika, Vol. x. p. 272. 
† It may be noted that at the beginning of the period we have the disturbing influence of war and
at the end of the period wholly changed conditions due to a great limitation of births. The means 
depend on differences of mortality under these conditions. 





ETHEL M. ELDERTON AND KARL PEARSON 497


TABLE III.

Correlations of Deathrates and Differences of Deathrates.

[Table printed sideways in the original; its entries are not legible in this copy. The columns are headed First Year ($m_1$), Second Year ($m_2$), Third Year ($m_3$), Fourth Year ($m_4$) and Fifth Year ($m_5$), and the rows give the correlations for the deathrates themselves and for their successive differences.]








498 Further Evidence of Natural Selection in Man


TABLE IV.

Means and their Probable Errors.

[Table printed sideways in the original; its entries are not legible in this copy. The columns are headed Year: 0—1 ($m_1$), Year: 1—2 ($m_2$), Year: 2—3 ($m_3$), Year: 3—4 ($m_4$) and Year: 4—5 ($m_5$); the rows are the Actual Deathrate, $m$, and the 1st to 8th Differences. A note to the table, only partly legible, defines the first difference for a given year and adds that the differences of higher orders follow the same rule.]


TABLE V.

Ratio of Means of Differences to their Probable Errors.

[Table printed sideways in the original; its entries are not legible in this copy. The columns are the five years of life as in Table IV.]



ETHEL M. ELDERTON AND KARL PEARSON 499


round zero in the required manner. The interest of this test is that we see that 
the bulk of the time effect has been removed even when we reach the second 
difference, a result confirmed by the fact that the correlation of the deathrates’ 
second differences is in every case already substantially negative. 


(iii) A third set of tests are those which are based on the standard deviations
of the differences. In the first place, if we assume steadiness to have set in, we
can calculate $\sigma_x$, the intrinsic standard deviation, from the known value of $\sigma_{\delta_s x}$ by
means of Dr Anderson’s formula cited above (p. 496). Table VI gives the intrinsic
values of $\sigma_x$, i.e. $\sigma_x$ as deduced from the variability of the differences. It will be
at once observed that for the third difference the mortality ratios of the third,
fourth and fifth years of life reach steady standard deviations. In the case of the
first year of life it is not till the eighth difference that this result is reached, while
in the case of the second year, it can hardly be said to have been obtained with
the ninth difference. A distinction should be noted here of which the exact
physical significance is not obvious to us. In the third, fourth and fifth years
the intrinsic standard deviations fall to steady values, but in the first and second
years they rise towards those values and these are just the cases where steady
values are not absolutely reached.
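The passage from the variability of the differences back to an intrinsic standard deviation may be sketched numerically. The sketch below assumes the usual variate-difference relation, that for residuals uncorrelated from year to year the variance of the $s$th difference is $\binom{2s}{s}$ times the intrinsic variance; this agrees with the "Theory" column of Table VII, but the formula of p. 496 is not reproduced in this section. The function names and the simulated series are illustrative only and are not taken from the memoir.

```python
import random
from math import comb, sqrt
from statistics import pstdev


def difference(series, order):
    """Return the `order`-th finite difference of a sequence."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def intrinsic_sd(sd_of_difference, order):
    """Recover the intrinsic s.d. from the s.d. of the `order`-th differences.

    Assumption: uncorrelated residuals, so that
    var(order-th difference) = C(2*order, order) * var(x).
    """
    return sd_of_difference / sqrt(comb(2 * order, order))


def demo(order=3, true_sd=2.0, n=20000, seed=0):
    """Simulate a linear time trend plus independent noise, then recover true_sd."""
    rng = random.Random(seed)
    series = [0.05 * t + rng.gauss(0.0, true_sd) for t in range(n)]
    return intrinsic_sd(pstdev(difference(series, order)), order)


if __name__ == "__main__":
    # The estimate should lie close to the simulated value of 2.0.
    print(round(demo(), 2))
```

Differencing removes the linear trend at the first step, so only the residual variability, inflated by the known binomial factor, is left to be rescaled.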


TABLE VI.

Intrinsic Standard Deviations ($\sigma_x$).

| Order of   | Year: 0-1 (m1) | Year: 1-2 (m2) | Year: 2-3 (m3) | Year: 3-4 (m4) | Year: 4-5 (m5) |
| Difference |   ♂   |   ♀    |   ♂   |   ♀    |   ♂   |   ♀    |   ♂   |   ♀    |   ♂   |   ♀    |
| 1st        | 7.62  | 6.96   | 3.90  | 3.86   | 1.75  | 1.83   | 1.53  | 1.52   | 1.23  | 1.09   |
| 2nd        | 7.89  | 7.22   | 4.13  | 4.09   | 1.67  | 1.81   | 1.29  | 1.34   | 1.00  |  .83   |
| 3rd        | 8.14  | 7.45   | 4.32  | 4.28   | 1.62  | 1.78   | 1.21  | 1.29   |  .93  |  .80   |
| 4th        | 8.34  | 7.63   | 4.47  | 4.42   | 1.59  | 1.76   | 1.18  | 1.29   |  .87  |  .77   |
| 5th        | 8.50  | 7.76   | 4.60  | 4.54   | 1.58  | 1.76   | 1.17  | 1.28   |  .86  |  .77   |
| 6th        | 8.61  | 7.83   | 4.71  | 4.63   | 1.59  | 1.77   | 1.17  | 1.28   |  .86  |  .78   |
| 7th        | 8.66  | 7.84   | 4.80  | 4.72   |  -    |  -     |  -    |  -     |  -    |  -     |
| 8th        | 8.68  | 7.85   | 4.88  | 4.78   |  -    |  -     |  -    |  -     |  -    |  -     |
| 9th        |  -    |  -     | 4.97  | 4.82   |  -    |  -     |  -    |  -     |  -    |  -     |





(iv) There is another test for the standard deviations of the differences
deduced by Cave and Pearson from the Andersonian results and used by them in
their memoir on Italian Index Values*, namely, as steadiness is approached the
ratio of the squares of the standard deviations of successive differences should approach
closer and closer to 4, the exact value being

$$\frac{\sigma^2_{\delta_s m}}{\sigma^2_{\delta_{s-1} m}} = 4 - \frac{2}{s}.$$

* Biometrika, Vol. X. p. 346.


Biometrika x 







Further Evidence of Natural Selection in Man 


Table VII shows how rapidly the system approximates to the theoretical values 
in the case of the higher differences. 
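The "Theory" column of Table VII can be reproduced in a few lines. The sketch assumes, as is usual for the variate difference method, that the variance of the $s$th difference of uncorrelated residuals is proportional to $\binom{2s}{s}$, so that successive ratios are $4 - 2/s$; the function names are illustrative.

```python
from math import comb


def theoretical_ratio(s):
    """Ratio of the variance of the s-th difference to that of the (s-1)-th."""
    return comb(2 * s, s) / comb(2 * s - 2, s - 1)


def closed_form(s):
    """The same ratio written as 4 - 2/s."""
    return 4 - 2 / s


if __name__ == "__main__":
    for s in range(1, 7):
        print(s, round(theoretical_ratio(s), 3))
```

For the first six orders this gives 2, 3, 3.333, 3.5, 3.6 and 3.667, which are the theoretical values against which the observed ratios are compared.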


On the basis of all the tests we have applied we may, we think, conclude that 
by the sixth difference we have reached values for the correlation of deathrates 
in successive years which are in all probability close to the organic or intrinsic 
values. Only in the first and second years of life is steadiness not absolutely 
reached, but for practical purposes but little change can be anticipated in the
correlation coefficients.

TABLE VII.

Ratio of Squared Standard Deviations.

| Order of   |      m1       |      m2       |      m3       |      m4       |      m5       |     Mean      | Theory |
| Difference |   ♂   |   ♀   |   ♂   |   ♀   |   ♂   |   ♀   |   ♂   |   ♀   |   ♂   |   ♀   |   ♂   |   ♀   |        |
| 1          |  [?]  |  .956 |  .354 |  .369 |  .142 |  .144 |  .194 |  [?]  |  .211 |  [?]  |  .370 |  .368 | 2      |
| 2          | 3.199 | 3.221 | 3.374 | 3.384 | 2.731 | 2.934 | 2.127 | 2.317 | 1.996 | 1.728 | 2.685 | 2.717 | 3      |
| 3          | 3.547 | 3.552 | 3.638 | 3.633 | 3.133 | 3.240 | 2.944 | 3.086 | 2.896 | 3.109 | 3.232 | 3.324 | 3.333  |
| 4          | 3.676 | 3.673 | 3.754 | 3.741 | 3.363 | 3.428 | 3.341 | 3.504 | 3.054 | 3.227 | 3.438 | 3.515 | 3.500  |
| 5          | 3.738 | 3.723 | 3.811 | 3.793 | 3.591 | 3.604 | 3.509 | 3.562 | 3.504 | 3.641 | 3.631 | 3.665 | 3.600  |
| 6          | 3.756 | 3.734 | 3.848 | 3.828 | 3.690 | 3.683 | 3.708 | 3.648 | 3.691 | 3.758 | 3.739 | 3.730 | 3.667  |

[?] marks an entry illegible in this copy.

(5) We can look at the association of deathrates in successive years from
another standpoint. We can ask, if there be an increase of 10 points in the
deathrate for a given year, what increase or decrease will there be of deathrate
in the same group in the following year?

In Table VIII below the second column gives the spurious change which is
apparent in the crude data, the third column gives the real organic change which
is discovered when the time-factor is removed.

TABLE VIII.

Association of Deathrates without and with Annulment of Time-factor.

Result of an increase of 10 deaths per mille in one year of life on the deaths per
mille in the next year.

| Increase of 10 in    |   Disregarding Time-factor    |     Annulling Time-factor      |
| Deathrate of         |       ♂       |       ♀       |       ♂        |       ♀       |
| 1st Year on 2nd Year | Increase 3.3  | Increase 3.8  | Decrease 3.[?] | Decrease 4.3  |
| 2nd Year on 3rd Year | Increase 6.1  | Increase 6.6  | Decrease [?]   | Decrease 2.5  |
| 3rd Year on 4th Year | Increase 6.9  | Increase 6.7  | Decrease [?]   | Decrease 5.3  |
| 4th Year on 5th Year | Increase 7.0  | Increase 6.8  | Decrease [?]   | Decrease 4.5  |

[?] marks a figure illegible in this copy.
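The entries of Table VIII are of the nature of regression coefficients, and one of them can be checked from figures printed elsewhere in the memoir. The sketch assumes that the change is computed as $10\,r\,\sigma_2/\sigma_1$. It also takes $-.7188$ to be the female sixth-difference correlation between the first and second years (quoted later in the paper) and 7.83 and 4.63 to be the corresponding female entries of Table VI. Both identifications are a reading of the printed figures, not a statement of the authors' procedure.

```python
def change_per_ten(r, sd_earlier, sd_later):
    """Expected change in the later year's deathrate for a rise of 10 per
    mille in the earlier year, taken as the linear regression coefficient
    r * sd_later / sd_earlier multiplied by 10."""
    return 10.0 * r * sd_later / sd_earlier


if __name__ == "__main__":
    # Female deathrates, first year on second year, sixth differences
    # (values as read from the memoir; see the assumptions stated above).
    print(round(change_per_ten(-0.7188, 7.83, 4.63), 1))
```

On these assumptions the result is a decrease of about 4.3 per mille, in agreement with the first legible "Decrease" entry of the table.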

























It is easy to see how those who contented themselves with crude deathrates, 
making no allowance for the betterment of deathrates with the time, interpreted 
a higher deathrate in one year to mean a higher deathrate in the next year of life, 
and so questioned whether natural selection applied to civilised man. As a 
matter of fact we see that the true organic relationship of deathrates is much 
more probably summed up in the statement that a decrease or an increase of 
deathrate in one year of infancy or childhood is in each case followed by an 
increase or a decrease in the deathrate of the survivors of the same group in the 
following year. Disregarding the time-factor we have a result quite incompatible 
with natural selection; annulling the time-factor, we have a result not only 
compatible with natural selection, but very difficult of any other interpretation 
than that of a selective deathrate, i.e. a heavy mortality means a selection of the 
weaker members, and the exposure to risk in the following year of a selected 
or stronger population, which has accordingly a lesser deathrate. 


(6) We now turn to the problem of how far this influence extends, or 
probably it would be better to phrase it: how far this influence can be traced. 
It is not only that the age group we follow does not absolutely consist of the 
same individuals but even with those members that are the same there is very 
often change of environment due not to time but to a change of locality or 
of economic condition affecting individuals. Added to this there is a continuous 
immigration and emigration. But beyond these causes weakening the association, 
there is another difficulty of great importance arising from what has happened 
in the intervening years. We wish to find out how an increase of deathrate 
in the sth year of life affects the deathrate in the (s+2)th year of life, but 
the events in the (s+1)th year will largely dominate and, perhaps, screen the 
results we are seeking. Such problems are always arising in statistical research. 
For example, a child may resemble its grandfather simply because both grand- 
father and child are like the child’s father. We know that the problem is 
answered statistically by inquiring what is the relation between a character in 
the child and the grandparent for a constant value of the character in the parent. 
In precisely the same manner we must in the present problem inquire: What 
is the correlation between the deathrates in the sth and (s+2)th year of life 
for constant deathrate in the (s+1)th year of life ? 


TABLE IX.

Influence of Natural Selection at Interval of Two Years.

| Partial Correlation of            | For constant   |    ♂    |    ♀    |
| $\delta_6 m_1$ and $\delta_6 m_3$ | $\delta_6 m_2$ | −.4307  | −.5242  |
| $\delta_6 m_2$ and $\delta_6 m_4$ | $\delta_6 m_3$ | −.2555  | −.2058  |
| $\delta_6 m_3$ and $\delta_6 m_5$ | $\delta_6 m_4$ | −.1798  | −.3129  |













We shall of course work with the sixth difference correlations in order to free 
ourselves substantially from the time-factor. 

Here again the judgment based on the partial correlation of the crude 
deathrates is in all six cases reversed. For every one of the partial coefficients 
of crude deathrates shows that for intervening year with a constant deathrate, 
an increase of deathrate in the earlier year means an increase, not a decrease in 
the later year. Actually an increase in the one year is shown in Table X in all 
cases to be followed by a decrease at two years’ interval. 


TABLE X.

Influence of Natural Selection at Interval of Two Years.

Result of an increase of 10 deaths per mille in the second following year.

| Increase of 10 in Deathrate of | For constant deathrate in |       ♂       |       ♀       |
| 1st Year on that of 3rd Year   | 2nd Year                  | Decrease .81  | Decrease 1.28 |
| 2nd Year on that of 4th Year   | 3rd Year                  | Decrease .61  | Decrease .52  |
| 3rd Year on that of 5th Year   | 4th Year                  | Decrease .99  | Decrease 1.4  |


It will be seen that these values are appreciable although far less important 
than the decreases produced in a following year by an increase in the immediately 
preceding year. Thus we judge that a selection of the weakly children in one 
year is largely influential on the deathrate of the immediately following year, and 
diminishes, as we might anticipate, with increase of time. 

Some objection might, however, be taken to the sixth difference correlations, 
when we consider deathrates of the same group two years apart. They are 


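|                                       | Male.        | Female.      |
| $r_{\delta_6 m_1 \cdot \delta_6 m_3}$ | +.227 ± .159 | +.200 ± .161 |
| $r_{\delta_6 m_2 \cdot \delta_6 m_4}$ | +.339 ± .149 | +.377 ± .144 |
| $r_{\delta_6 m_3 \cdot \delta_6 m_5}$ | +.397 ± .142 | +.398 ± .142 |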


It will be seen that while they are all of the same sign and fairly accordant for 
both sexes the probable errors are becoming very substantial relative to the 
coefficients. We have indeed too limited a range of years. 

(7) If now we take out the correlation coefficients of the sixth differences for 
three years’ interval, and again for four years’ interval we find great irregularities. 








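|                                       | Male.        | Female.      |
| $r_{\delta_6 m_1 \cdot \delta_6 m_4}$ | +.205 ± .161 | +.035 ± .168 |
| $r_{\delta_6 m_2 \cdot \delta_6 m_5}$ | −.030 ± .168 | +.072 ± .167 |
| $r_{\delta_6 m_1 \cdot \delta_6 m_5}$ | −.181 ± .163 | −.251 ± .158 |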





The correlations now do not agree in sign, they are insignificant having regard 
to their probable errors, and there is no close correspondence for the two sexes. 













We should need a far longer period than 50 years to determine certainly even the 
signs of these correlations, and their real magnitudes would require still ampler 
data. It would appear impossible to assert on the basis of the above values of
the correlations at three and four years’ intervals more than the insignificance
of the associations between deathrates of the same groups at intervals of more
than two years*. In other words the effect of intense selection appears to be
exhausted after an interval of two years. The word “appears” is used purposely
because there must be some spurious weakening of the effect due to our not being 
able to follow absolutely the same individuals. 

(8) We have further studied to some extent the relationship between the 
male and female deathrates. There is almost perfect correlation between male 
and female deathrates in any given year of life after we annul the time-factor. 
Thus, if we represent female deathrates by m’, we have as illustrations: 

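$$r_{\delta_6 m_1 \cdot \delta_6 m_1'} = +.9905,$$
$$r_{\delta_6 m_2 \cdot \delta_6 m_2'} = +.9880,$$
$$r_{\delta_6 m_3 \cdot \delta_6 m_3'} = +.9687,$$
$$r_{\delta_6 m_4 \cdot \delta_6 m_4'} = +.9800.$$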
Of course the sole significance of these values lies in the fact that years of 
stress, whether due to climatic or epidemic causes, affect equally infants or 
children of both sexes of the same age. But these very high values in our 
opinion cast considerable doubt on the partial correlations derived from them. 
We have in fact
$${}_3r_{12} = \frac{r_{12} - r_{13}r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} = \frac{N}{D},$$
and if we suppose $r_{12}$ and $r_{23}$ nearly equal, then if $r_{13}$ be of the above high value
$N$ will be extremely small, but $D$ is also, owing to the presence of the factor
$\sqrt{1 - r_{13}^2}$, very small. Thus ${}_3r_{12}$ although it may be very considerable is the ratio


* Actually the partial correlations of the sixth differences at three years’ interval based on the above
values are:

| Correlation of                    | For constant                      |   ♂   |   ♀   |
| $\delta_6 m_1$ and $\delta_6 m_4$ | $\delta_6 m_2$ and $\delta_6 m_3$ | +.526 | +.181 |
| $\delta_6 m_2$ and $\delta_6 m_5$ | $\delta_6 m_3$ and $\delta_6 m_4$ | +.485 | +.251 |

These are certainly all positive, but they are irregular as between the sexes and probably quite
unreliable for the reasons already given. Should a more extended experience show that there is a
real if slight positive correlation between deathrates at three years’ interval, while there is
considerable negative correlation at one and two years’ intervals, we should be compelled to discuss
whether there may not be something periodic in the nature of the heavy and light deathrates of
infancy and childhood. We have been unable to trace any sign of such periodicity either in the
deathrates or in the graphs drawn, but we do not believe that a very short periodicity would be
eliminated by the variate difference method using any moderate number of differences. We cannot on
this point accept Dr Anderson’s view. See Biometrika, Vol. X. p. 279.













of two small quantities and any disturbing cause which but slightly modifies the
value of either $r_{12}$ or $r_{23}$ may even change the sign of $N$ and so swing ${}_3r_{12}$ over from
a considerable positive to a considerable negative value*.

We can consider the correlations between the female deathrate in one year and 
the male deathrate in a second year, supposing of course time influence annulled. 
We have 

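$$r_{\delta_6 m_1 \cdot \delta_6 m_2'} = -.6674 \quad (r_{\delta_6 m_1 \cdot \delta_6 m_2} = -.6879),$$
$$r_{\delta_6 m_1' \cdot \delta_6 m_2} = -.7337 \quad (r_{\delta_6 m_1' \cdot \delta_6 m_2'} = -.7188),$$
$$r_{\delta_6 m_3 \cdot \delta_6 m_4'} = -.7313 \quad (r_{\delta_6 m_3 \cdot \delta_6 m_4} = -.7032),$$
$$r_{\delta_6 m_3' \cdot \delta_6 m_4} = -.7278 \quad (r_{\delta_6 m_3' \cdot \delta_6 m_4'} = -.7313).$$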


Thus we see that the same remarkably high negative correlations exist between 
the male and female deathrates of successive years of groups born in the same year 
as exist between male and male or female and female deathrates within the same 
group in successive years. In fact in two out of the four correlations the cross 
relationships are higher than the direct, although the differences are scarcely 
significant. Here again there is nothing noteworthy, considering the very high 
correlations just noted to exist between the male and female deathrates of groups 
born in the same year. We can, however, endeavour to correct such values by 
finding the relationship between the deathrate in females in the first year of life 
and males born in the same year in their second year of life for a constant death- 
rate of males in the first year of life. Or still more stringently between the 
deathrates of females in the first year of life with males in the second year of life 
for constant male deathrate in the first year of life and constant female deathrate 
in the second year of life. We should anticipate that such values would come 
out small or insignificant, if our interpretation of the high negative correlations 
between deathrates of the same group in successive years of life be a correct one, 
i.e. that the high deathrate leaves a stronger population. For a heavy deathrate
in the females of one year should not leave a stronger population of males for the 
following year after correction by partial correlation. 

We obtained the following correlations : 

$${}_{\delta_6 m_1} r_{\delta_6 m_1' \cdot \delta_6 m_2} = -.5240 \pm .0692,$$
$${}_{\delta_6 m_1'} r_{\delta_6 m_1 \cdot \delta_6 m_2'} = +.4665 \pm .0746.$$
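These two partial correlations can be recomputed from the total correlations printed in the memoir. The sketch below uses the ordinary first-order partial correlation formula, with the probable error taken as $.67449(1 - r^2)/\sqrt{n}$ and $n = 50$ (the total frequency mentioned later in the paper). The assignment of the printed coefficients to particular pairs of deathrates is a reading of the text, and the function names are illustrative.

```python
from math import sqrt


def partial_correlation(r12, r13, r23):
    """Return (N, D, N/D) for the correlation of 1 and 2 with 3 held constant."""
    numerator = r12 - r13 * r23
    denominator = sqrt((1 - r13 ** 2) * (1 - r23 ** 2))
    return numerator, denominator, numerator / denominator


def probable_error(r, sample_size):
    """Probable error of a correlation coefficient, 0.67449 (1 - r^2) / sqrt(n)."""
    return 0.67449 * (1 - r * r) / sqrt(sample_size)


if __name__ == "__main__":
    # Female first year and male second year, male first year constant.
    print(partial_correlation(-0.7337, 0.9905, -0.6879))
    # Male first year and female second year, female first year constant.
    print(partial_correlation(-0.6674, 0.9905, -0.7188))
```

In both cases the numerator is of the order of .05 and the denominator of the order of .1, which is the situation described in the text: a considerable coefficient arising as the ratio of two small quantities.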


* The reader must note that we say a “disturbing cause”; it is not the mere result of random
sampling affecting $N$. The probable error of $N = r_{12} - r_{13}r_{23}$ for a sample of size $n$ is given by

$$.67449\,\sigma_N = \frac{.67449}{\sqrt{n}}\left\{D^2 - N^2\left[2(1 - r_{13}^2) + 2(1 - r_{23}^2) + 1 - r_{12}^2 - 3\right]\right\}^{\frac{1}{2}},$$

and is thus quite easy to calculate. We have tested it on a number of cases of partial correlations
worked out for this paper and find that if $.67449\,\sigma_N$ is of the same order as $N$, then $.67449\,\sigma_{{}_3r_{12}}$ is
of much the same order as ${}_3r_{12}$. In other words, if $N$ is so small relative to its probable error that
it might easily have a reversed sign, then ${}_3r_{12}$ is insignificant as compared to its probable error also.
For example, $N = .0446$ and $D = .0956$ leads to ${}_3r_{12} = .4665$ with a probable error of .0746. ${}_3r_{12}$ is
accordingly considerable and significant, but the probable error of $N$ is only .0105, and we can hardly
suppose the sign of ${}_3r_{12}$ due to a random sampling variation in the sign of $N$.
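The figure of .0105 in this footnote can be checked directly. The sketch assumes the formula is to be read as $.67449\,\sigma_N = (.67449/\sqrt{n})\{D^2 - N^2[2(1-r_{13}^2) + 2(1-r_{23}^2) + 1 - r_{12}^2 - 3]\}^{1/2}$, the constant 3 being a reconstruction from a damaged print. It takes $r_{12} = -.6674$, $r_{13} = .9905$, $r_{23} = -.7188$ and $n = 50$ as the values belonging to the example; these, and the function name, are assumptions.

```python
from math import sqrt


def probable_error_of_numerator(r12, r13, r23, n):
    """Probable error of N = r12 - r13*r23 under the reading stated above."""
    numerator = r12 - r13 * r23
    d_squared = (1 - r13 ** 2) * (1 - r23 ** 2)
    bracket = 2 * (1 - r13 ** 2) + 2 * (1 - r23 ** 2) + 1 - r12 ** 2 - 3
    return 0.67449 / sqrt(n) * sqrt(d_squared - numerator ** 2 * bracket)


if __name__ == "__main__":
    print(probable_error_of_numerator(-0.6674, 0.9905, -0.7188, 50))
```

With this reading the probable error of $N$ comes to roughly a quarter of $N$ itself, so a reversal of sign by random sampling alone is, as the footnote says, hardly to be supposed.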
















These values were so startling and so contradictory, that we proceeded to
eighth differences with the results:

$${}_{\delta_8 m_1} r_{\delta_8 m_1' \cdot \delta_8 m_2} = -.6013 \pm .0609,$$
$${}_{\delta_8 m_1'} r_{\delta_8 m_1 \cdot \delta_8 m_2'} = +.5481 \pm .0667,$$

which emphasised as well as confirmed the previous results.

Now it seems absurd to suppose that the deaths of female infants in one year 
can organically influence the deaths of males of the same group in the next year, 
or male infants the deaths of females in the successive year. But the extraordinary 
feature of these results is that while a high deathrate of female infants lessens the 
deathrate of males in the second year of life of the same group, a high deathrate 
of male infants increases the deathrate of females in the second year of life of the 
same group. 

In order to throw further light on the matter we investigated male and female 
deathrate correlations in the third and fourth years of life. We found 

$${}_{\delta_6 m_3} r_{\delta_6 m_3' \cdot \delta_6 m_4} = -.2640 \pm .0887,$$
$${}_{\delta_6 m_3'} r_{\delta_6 m_3 \cdot \delta_6 m_4'} = -.0082 \pm .0954.$$

The second is practically zero, the first of no importance having regard to the
high values of the correlation of deathrates of groups of the same sex in the third
and fourth years of life (♂: −.703 ± .085; ♀: −.731 ± .078). Had we come to
these values at first we should have been content, but the cross relation between 
the infant deaths of one sex and the deaths in the second year of life of the 
opposite sex was undoubtedly puzzling. 

We then proceeded to still further limit our conditions by determining the 
partial correlation between female infants in one year and males in the second 
year of life of the same birth-year when the deathrates of the males in the first 
year of life and of the females in the second were both constant. We obtained 
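$${}_{\delta_6 m_1 \cdot \delta_6 m_2'} r_{\delta_6 m_1' \cdot \delta_6 m_2} = +.1632 \pm .0928,$$
$${}_{\delta_6 m_1' \cdot \delta_6 m_2} r_{\delta_6 m_1 \cdot \delta_6 m_2'} = +.2997 \pm .0868.$$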

Having regard to their probable errors these are of a quite different and 
negligible significance when compared with the values of 


${}_{\delta_6 m_1} r_{\delta_6 m_1' \cdot \delta_6 m_2}$ and ${}_{\delta_6 m_1'} r_{\delta_6 m_1 \cdot \delta_6 m_2'}$


given above. 
It is worth while noting that 
$${}_{\delta_6 m_2'} r_{\delta_6 m_1' \cdot \delta_6 m_2} = -.2188 \pm .0908,$$
$${}_{\delta_6 m_2} r_{\delta_6 m_1 \cdot \delta_6 m_2'} = +.1088 \pm .0943$$
also give values of no practical importance. Or, to annul the spurious influence 
of infantile deaths of one sex, A, on deaths in the second year of sex, B, of the 
same group, it is more effective to render constant the deaths of A in the second 
year of life than of B in the first year of life. 













In the light of this result we have found the correlations between deathrates 
of sex A in the third and sex B in the fourth year of life, for constant deathrate 
of sex A in the fourth year of life. 

We have 

$${}_{\delta_6 m_4'} r_{\delta_6 m_3' \cdot \delta_6 m_4} = -.0818 \pm .0948,$$
$${}_{\delta_6 m_4} r_{\delta_6 m_3 \cdot \delta_6 m_4'} = -.1477 \pm .0933.$$
Both of these may be taken as zero, having regard to their probable errors. 
Thus on the whole, while the relation between the deathrate of a group of one 
sex in one year and the deathrate of the remainder in the following year of life 
appears after the annulment of the time-factor to be very considerable and 
negative, there does not appear to be any organic relation between the deathrate 
of sex A in one year and sex B in the following year, if we proceed by the method 
of partial correlation. But at the same time we believe that this method must 
be used with very considerable caution, and that to avoid erroneous conclusions the 
whole problem must be investigated from a variety of standpoints in cases like the 
present where one of the three total correlations is extremely high. The numerator
$N$ ranges in the cases we have been discussing from about .01 to .05 and with a
small total frequency like 50, any disturbing cause, apart from random variation,
may have marked influence*.


(9) The conclusion which we have formed is that in the present problem of 
natural selection it is probably better to annul the environmental factor by 
the variate difference method rather than to proceed by the method of partial 
correlation as we have hitherto done. 

By the former method we have shown that for both sexes a heavy deathrate 
in one year of life means a markedly lower deathrate in the same group in the 
following year of life, and that this extends in a lessened degree to the year 
following that, but is not by the present method easy to trace further. It is 
difficult to believe that this important fact can be due to any other source than 
the influence of natural selection, i.e. a heavy mortality leaves behind it a stronger 
population. Nature is not concerned with the moral or the immoral, which are 
standards of human conduct, and the duty of the naturalist is to point out what 
goes on in Nature. There can now scarcely be a doubt that even in highly 
organised human communities the deathrate is selective, and physical fitness is 
the criterion for survival. To assert the existence of this selection and measure 
its intensity must be distinguished from advocacy of a high infant mortality as 
a factor of racial efficiency. This reminder is the more needful as there are not 
wanting those who assert that demonstrating the existence of natural selection in 
man is identical with decrying all efforts to reduce the infantile deathrate. 

We have to acknowledge the great assistance we have received from our 
colleague Miss Beatrice M. Cave in the laborious arithmetical work of this paper. 


* If $F = N/D$, where $N$ and $D$ are both small, but $F$ finite, then $\delta F/F = \delta N/N - \delta D/D$ and small
disturbances produce great results in $F$.


























FREQUENCY DISTRIBUTION OF THE VALUES OF THE 
CORRELATION COEFFICIENT IN SAMPLES FROM 
AN INDEFINITELY LARGE POPULATION. 


By R. A. FISHER.


1. My attention was drawn to the problem of the frequency distribution of the 
correlation coefficient by an article published by Mr H. E. Soper* in 1913. Seeing 
that the problem might be attacked by means of geometrical ideas, which I had 
previously found helpful in the consideration of samples, I have examined the two 
articles by “Student†,” upon which Mr Soper’s more elaborate work was based,
with a view to checking and verifying the conclusions there attained. 

“Student,” if I do not mistake his intention, desiring primarily to obtain 
a just estimate of the accuracy to be ascribed to the mean of a small sample, 
found it necessary to allow for the fact that the mean square error of such a 
sample is not generally equal to the standard deviation of the normal population 
from which it is drawn. He was led, in fact, to study the frequency distribution 
of the mean square error. He calculated algebraically the first four moments of 
this frequency curve, both about the zero point, and about its mean, observed 
a simple law to connect the successive moments, and discovered a frequency curve,
which fitted his moments, and gave the required law. 

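Thus if $x_1, x_2, \ldots x_n$ are the members of a sample,

$$n\bar{x} = x_1 + x_2 + \ldots + x_n,$$

and

$$n\mu^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \ldots + (x_n - \bar{x})^2,$$

the frequency with which the mean square error lies in the range $d\mu$ is proportional to

$$\mu^{n-2} e^{-\frac{n\mu^2}{2\sigma^2}}\, d\mu.$$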

This result, although arrived at by empirical methods, was established almost 
beyond reasonable doubt in the first of “Student’s” papers. It is, however, of 
interest to notice that the form establishes itself instantly, when the distribution 
of the sample is viewed geometrically. 

* Biometrika, Vol. IX. p. 91. † Ibid. Vol. VI. pp. 1 and 302.










508 Distribution of the Correlation Coefficients of Samples 


In the second of these two papers the more difficult problem of the frequency
distribution of the correlation coefficient is attempted. For samples of 2 the
frequency distribution between the only two possible values −1 and +1 was
determined by Sheppard’s theorem to be in the ratio $\frac{\pi}{2} + \sin^{-1}\rho : \frac{\pi}{2} - \sin^{-1}\rho$,
where $\rho$ is the correlation of the population. Besides this theoretical result,
“Student” appeals only to experimental data. From these he derives an
empirical form for the distribution when $\rho = 0$, and makes several valuable
suggestions. It has been the greatest pleasure and interest to myself to observe
with what accuracy “Student’s” insight has led him to the right conclusions.
The form when $\rho = 0$ is absolutely correct, and as a further instance I may quote
the remark* “I have dealt with the cases of samples of 2 at some length, because
it is possible that this limiting value of the distribution, with its mean of
$\frac{2}{\pi}\sin^{-1}\rho$ and its second moment coefficient of $1 - \left(\frac{2}{\pi}\sin^{-1}\rho\right)^2$, may furnish a clue
to the distribution when n is greater than 2.” As a matter of fact it is just these
quantities with which we shall be concerned.
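The case of samples of 2 is simple enough to be set out numerically. The sketch below takes the chance of $r = +1$ to be $\frac{1}{2} + \frac{1}{\pi}\sin^{-1}\rho$, which is Sheppard's theorem in its usual form, and derives from it the mean and second moment coefficient quoted from “Student”. The function name is illustrative.

```python
from math import asin, pi


def sample_of_two(rho):
    """Distribution of r for samples of 2 from a normal population.

    Returns (P(r = +1), P(r = -1), mean of r, second moment about the mean).
    """
    p_plus = 0.5 + asin(rho) / pi
    p_minus = 1.0 - p_plus
    mean = p_plus - p_minus
    second_moment = 1.0 - mean ** 2  # r*r is always 1 for samples of 2
    return p_plus, p_minus, mean, second_moment


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.66):
        print(rho, sample_of_two(rho))
```

Since $r^2 = 1$ in every sample of 2, the second moment about the mean is one minus the square of the mean, which is the expression quoted above.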


To Mr Soper’s laborious and intricate paper I cannot hope to do justice.
I have been able to establish the substantial accuracy and value of his approximations.
It is one of the advantages of approaching a problem from opposite
standpoints that Mr Soper’s forms are most accurate for those larger values of n,
where the exact formulae become most complicated.

2. The problem of the frequency distribution of the correlation coefficient r, 
derived from a sample of n pairs, taken at random from an infinite population, 
may be solved, when that population can be represented by a normal surface, 
with the aid of certain very general conceptions derived from the geometry of 
n dimensional space. In this paper the general form will first be demonstrated, 
and for a few important cases some of the successive moments will be derived. 
Incidentally it will be of interest to compare the exact form with Mr Soper’s 
approximation, and with reference to the experimental data supplied by “Student.” 


If the frequency distribution of the population be specified by the form

$$df = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\; e^{-\frac{1}{1-\rho^2}\left\{\frac{(x-m_1)^2}{2\sigma_1^2} - \frac{2\rho(x-m_1)(y-m_2)}{2\sigma_1\sigma_2} + \frac{(y-m_2)^2}{2\sigma_2^2}\right\}}\, dx\, dy,$$

where $df$ is the chance that any observation should fall into the range $dx\,dy$, then
the chance that $n$ pairs should fall within their specified elements is

$$\frac{1}{\left(2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}\right)^n}\; e^{-\frac{1}{1-\rho^2}\sum_1^n\left\{\frac{(x-m_1)^2}{2\sigma_1^2} - \frac{2\rho(x-m_1)(y-m_2)}{2\sigma_1\sigma_2} + \frac{(y-m_2)^2}{2\sigma_2^2}\right\}}\, dx_1\, dy_1 \ldots dx_n\, dy_n \quad \ldots(\mathrm{I}),$$

and this we interpret as a simple density distribution in 2n dimensions.

* Biometrika, Vol. VI. p. 304.
























For the variables $x$ and $y$ it is now necessary to substitute the statistical
derivatives determined by the equations

$$n\bar{x} = \sum_1^n (x), \qquad n\bar{y} = \sum_1^n (y),$$
$$n\mu_1^2 = \sum_1^n (x - \bar{x})^2, \qquad n\mu_2^2 = \sum_1^n (y - \bar{y})^2,$$
$$n r \mu_1 \mu_2 = \sum_1^n (x - \bar{x})(y - \bar{y}),$$

and it is evident that the only difficulty lies in the expression of an element of
volume in 2n dimensional space in terms of these derivatives.

The five quantities above defined have, in fact, an exceedingly beautiful
interpretation in generalised space, which we may now examine.
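The five statistical derivatives may be computed directly from their defining equations. The following is a minimal sketch; the function name and the small data set are illustrative and do not occur in the paper. Note that the divisor is $n$ throughout, as in the definitions above, and not $n - 1$.

```python
from math import sqrt


def sample_statistics(xs, ys):
    """Return (x_bar, y_bar, mu1, mu2, r) for n paired observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    mu1 = sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    mu2 = sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
    product_sum = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = product_sum / (n * mu1 * mu2)
    return x_bar, y_bar, mu1, mu2, r


if __name__ == "__main__":
    print(sample_statistics([1, 2, 3, 4], [1, 3, 2, 4]))
```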

3. Considering first the space of n dimensions in which the variations of $x$
are represented, the mean and mean square error of n observations are determined
by the relations of P, the point representing the n observations, to the line

$$x_1 = x_2 = x_3 = \ldots = x_n;$$

for the perpendicular PM drawn from P upon this line will lie in the region

$$x_1 + x_2 + \ldots + x_n = n\bar{x},$$

and will meet it at the point M, where

$$x_1 = \bar{x}, \quad x_2 = \bar{x}, \quad \ldots \quad x_n = \bar{x};$$

further, since

$$PM^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \ldots + (x_n - \bar{x})^2,$$

the length of PM is $\mu_1\sqrt{n}$.

An element of volume in this n dimensional space may now without difficulty
be specified in terms of $\bar{x}$ and $\mu_1$; for, given $\bar{x}$ and $\mu_1$, P must lie on a sphere in
n − 1 dimensions, lying at right angles to the line OM, and the element of
volume is

$$C\mu_1^{n-2}\, d\mu_1\, d\bar{x},$$

where $C$ is some constant, which need not be determined.







The point in 2n dimensional space which is represented by the n pairs of
observations must be such that its projection on the n dimensional space, in
which $x$ is represented, lies upon a certain sphere of radius $\mu_1\sqrt{n}$, and on the space
in which $y$ is represented, upon another sphere of radius $\mu_2\sqrt{n}$, and now, when we
come to the interpretation of $r$, we must observe that to each point on the first
sphere there corresponds a certain point on the second sphere, to which it bears
the relation

$$\frac{x_1 - \bar{x}}{y_1 - \bar{y}} = \frac{x_2 - \bar{x}}{y_2 - \bar{y}} = \ldots = \frac{x_n - \bar{x}}{y_n - \bar{y}}.$$

In general this relation does not hold for the n pairs of observations, and the
two projections will not fall at corresponding points on the two spheres. If now
one of the spheres be turned round so as to occupy the same space as the other,
and so that the lines upon which $x_1$ and $y_1$, and the other pairs of coordinates, are
measured, coincide, then corresponding points will lie on the same radii, and the
correlation coefficient $r$ measures the cosine of the angle between the radii to the
two points specified by the observations.
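The geometrical statements of this section may be verified on any small sample. In the sketch below the radii are the vectors of deviations from the mean, the length of each is compared with $\mu\sqrt{n}$, and the cosine of the angle between them is compared with $r$. The function names and the data are illustrative only.

```python
from math import sqrt


def deviations(values):
    """Vector PM: deviations of the observations from their mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def length(vector):
    """Euclidean length of a vector."""
    return sqrt(sum(c * c for c in vector))


def cosine_between(u, v):
    """Cosine of the angle between two vectors."""
    return sum(a * b for a, b in zip(u, v)) / (length(u) * length(v))


def geometric_r(xs, ys):
    """Correlation coefficient obtained as the cosine between the two radii."""
    return cosine_between(deviations(xs), deviations(ys))


if __name__ == "__main__":
    xs, ys = [1, 2, 3, 4], [1, 3, 2, 4]
    print(length(deviations(xs)), geometric_r(xs, ys))
```

For this sample $\mu_1 = \sqrt{1.25}$ and $n = 4$, so that the radius $\mu_1\sqrt{n}$ is $\sqrt{5}$, and the cosine agrees with the product-moment value of $r$.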


Taking one of the projections as fixed at any point on the sphere of radius $\mu_2\sqrt{n}$,
the region for which $r$ lies in the range $dr$, is a zone, on the other sphere in n − 1
dimensions, of radius $\mu_1\sqrt{n}\sqrt{1 - r^2}$, and of width $\mu_1\sqrt{n}\,dr/\sqrt{1 - r^2}$, and therefore
having a volume proportional to $\mu_1^{n-2}(1 - r^2)^{\frac{n-4}{2}}\,dr$.


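4. We may now turn to the direct simplification of the expression (I), at each
stage discarding any factors which do not involve $r$.

$$e^{-\frac{1}{1-\rho^2}\sum_1^n\left\{\frac{(x-m_1)^2}{2\sigma_1^2} - \frac{2\rho(x-m_1)(y-m_2)}{2\sigma_1\sigma_2} + \frac{(y-m_2)^2}{2\sigma_2^2}\right\}}\, dx_1\, dy_1\, dx_2\, dy_2 \ldots dx_n\, dy_n$$

may be reduced to

$$e^{-\frac{n}{1-\rho^2}\left\{\frac{(\bar{x}-m_1)^2 + \mu_1^2}{2\sigma_1^2} - \frac{2\rho\{r\mu_1\mu_2 + (\bar{x}-m_1)(\bar{y}-m_2)\}}{2\sigma_1\sigma_2} + \frac{(\bar{y}-m_2)^2 + \mu_2^2}{2\sigma_2^2}\right\}}\, d\bar{x}\, d\bar{y}\, \mu_1^{n-2} d\mu_1\, \mu_2^{n-2} d\mu_2\, (1 - r^2)^{\frac{n-4}{2}}\, dr,$$

or to

$$e^{-\frac{n}{1-\rho^2}\left(\frac{\mu_1^2}{2\sigma_1^2} - \frac{2\rho r\mu_1\mu_2}{2\sigma_1\sigma_2} + \frac{\mu_2^2}{2\sigma_2^2}\right)}\, \mu_1^{n-2}\mu_2^{n-2}(1 - r^2)^{\frac{n-4}{2}}\, d\mu_1\, d\mu_2\, dr.$$

In order to integrate this expression from 0 to ∞, with respect to $\mu_1$ and $\mu_2$, let

$$\frac{\mu_1\mu_2}{\sigma_1\sigma_2} = \zeta, \qquad \frac{\mu_1\sigma_2}{\mu_2\sigma_1} = e^{z},$$

and we have

$$\int_0^\infty dz \int_0^\infty \zeta^{n-2} e^{-\frac{n\zeta}{1-\rho^2}(\cosh z - \rho r)}\, d\zeta \cdot (1 - r^2)^{\frac{n-4}{2}}\, dr,$$

or

$$\int_0^\infty \frac{dz}{(\cosh z - \rho r)^{n-1}} \cdot (1 - r^2)^{\frac{n-4}{2}}\, dr,$$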













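which, on substituting $\cos\theta$ for $-\rho r$, may be expressed in terms of a Legendre
function in the form

$$(i\,\mathrm{cosec}\,\theta)^{n-1}\, Q_{n-2}(i\cot\theta) \cdot (1 - r^2)^{\frac{n-4}{2}}\, dr \quad \ldots(\mathrm{II}).$$

Again

$$\int_0^\infty \frac{dz}{\cosh z + \cos\theta} = \frac{\theta}{\sin\theta},$$

$$\int_0^\infty \frac{dz}{(\cosh z + \cos\theta)^{n-1}} = \frac{1}{(n-2)!}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-2}\frac{\theta}{\sin\theta},$$

and since this is a function of $\rho r$ only, we may express the frequency distribution
by the convenient expression

$$(1 - r^2)^{\frac{n-4}{2}}\, \frac{\partial^{n-2}}{\partial r^{n-2}}\left(\frac{\theta}{\sin\theta}\right).$$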
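The two integrals on which this step rests are easily checked by numerical quadrature. The sketch below uses Simpson's rule on a truncated range, which is an arbitrary choice of method, and compares the results with $\theta/\sin\theta$ and with its first derived form $(1 - \theta\cot\theta)/\sin^2\theta$, the latter obtained by one application of the operator $\partial/(\sin\theta\,\partial\theta)$. The function names and the truncation point are illustrative.

```python
from math import cos, cosh, sin, tan


def integral(theta, power, upper=40.0, steps=8000):
    """Simpson's rule for the integral from 0 to `upper` of
    dz / (cosh z + cos theta)**power; the tail beyond 40 is negligible."""
    h = upper / steps
    c = cos(theta)

    def f(z):
        return 1.0 / (cosh(z) + c) ** power

    total = f(0.0) + f(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0


def closed_form_first(theta):
    """theta / sin(theta), the value of the integral for power 1."""
    return theta / sin(theta)


def closed_form_second(theta):
    """(1 - theta cot theta) / sin^2(theta), the value for power 2."""
    return (1.0 - theta / tan(theta)) / sin(theta) ** 2


if __name__ == "__main__":
    for theta in (1.0, 2.0):
        print(theta, integral(theta, 1), closed_form_first(theta))
        print(theta, integral(theta, 2), closed_form_second(theta))
```

The second power corresponds to samples of three and the third to samples of four, so the same check can be carried one step further to reach the case tabulated in section 5.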


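Professor Pearson has shown that this last result can be obtained directly
from Sheppard’s theorem* that

$$\frac{1}{2\pi\Sigma_1\Sigma_2\sqrt{1 - R^2}}\int_0^\infty\!\!\int_0^\infty e^{-\frac{1}{2(1-R^2)}\left(\frac{\mu_1^2}{\Sigma_1^2} - \frac{2R\mu_1\mu_2}{\Sigma_1\Sigma_2} + \frac{\mu_2^2}{\Sigma_2^2}\right)}\, d\mu_1\, d\mu_2 = \frac{\cos^{-1}(-R)}{2\pi};$$

making the substitutions

$$\frac{1}{(1 - R^2)\Sigma_1^2} = \frac{n}{(1 - \rho^2)\sigma_1^2},$$
$$\frac{1}{(1 - R^2)\Sigma_2^2} = \frac{n}{(1 - \rho^2)\sigma_2^2},$$
$$\frac{R}{(1 - R^2)\Sigma_1\Sigma_2} = \frac{nr\rho}{(1 - \rho^2)\sigma_1\sigma_2},$$

which give $R = \rho r$ and $\cos^{-1}(-R) = \theta$, we obtain

$$\int_0^\infty\!\!\int_0^\infty e^{-\frac{n}{2(1-\rho^2)}\left(\frac{\mu_1^2}{\sigma_1^2} - \frac{2\rho r\mu_1\mu_2}{\sigma_1\sigma_2} + \frac{\mu_2^2}{\sigma_2^2}\right)}\, d\mu_1\, d\mu_2 = \frac{(1 - \rho^2)\sigma_1\sigma_2}{n}\cdot\frac{\theta}{\sin\theta},$$

and hence differentiating (n − 2) times with respect to $r$, the required expression
is obtained.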


5. The form which we have now obtained may be applied without difficulty
to all small even values of $n$, and in such cases is peculiarly suitable for the
calculation of moments.

When $n = 2$ the ordinate of the curve, with abscissa $r$, is

$$\frac{\theta}{(1 - r^2)\sin\theta},$$

which becomes hyperbolic in the neighbourhoods of −1 and +1. The value

* Phil. Trans. Vol. 192, A, p. 141.





° 


512 Distribution of the Correlation Coefficients of Samples 


of r is, therefore, as we know, either −1 or +1, and the proportion, in which 
these occur, depends upon ρ. The ratio of the infinite areas included with the 
asymptotes of the above curve is 
$$\frac{\cos^{-1}\rho}{\cos^{-1}(-\rho)},$$
so that the mean value of a number of observations is $\frac{2}{\pi}\sin^{-1}\rho$. 
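The case n = 2 can be checked by a small simulation. The sketch below is illustrative only and is not part of the original paper: it assumes that, for a sample of two, r is simply the sign of the product of the two differences, and it compares the simulated mean with the value (2/π) sin⁻¹ρ implied by the ratio of areas. The function names, trial count and seed are hypothetical choices.

```python
import math
import random


def simulated_mean_r(rho, trials=20000, seed=1):
    """Monte Carlo mean of r for samples of two from a normal population."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # The differences x1 - x2 and y1 - y2 are themselves normally
        # correlated with correlation rho (their scale does not matter).
        dx = rng.gauss(0.0, 1.0)
        dy = rho * dx + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # With only two observations r can only be +1 or -1.
        total += 1 if dx * dy > 0 else -1
    return total / trials


def theoretical_mean_r(rho):
    """Mean of r for n = 2 as stated in the text: (2/pi) * arcsin(rho)."""
    return 2.0 / math.pi * math.asin(rho)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        print(rho, simulated_mean_r(rho), theoretical_mean_r(rho))
```

The two columns should agree to within the sampling error of the simulation, which is of the order of 0.01 for the trial count assumed here.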


When n=4 there is still no approach to normality, the curve takes the form 
1 9 

ant 9 (9 — 3 cot 8 + 34 cot? D, 

which, when 7 is positive, increases regularly from its value of 4 when @=0, to 

infinity, to which it approaches as @ approaches 7. Unless p is actually equal 

to 1, in which case r is also 1 of necessity, the curve has finite ordinates at both 

extremes. For calculating the number of values which should fall within any 
given range, the integral, $\frac{1}{\sin^2\theta}(1-\theta\cot\theta)$, may be directly tabulated, as has 
been done in forming the accompanying table of “Student’s” observations, and 
the corresponding expectations. The values given by Mr Soper’s formula are 
apposed for comparison. 


Table for comparison with p. 114, Biometrika, Vol. IX. 














F 
| Calculated | home ol P H.E.Soper’s| pie ° 
| r | frequency | Observed | DiGlerence = approxi- Ditlerence La 
| m | . m mation : m 
| | 
De eT F : i | : : ‘ | ae 
‘905—1 | 2021 | 1755 |) a. | a eee eee 
05—905| 1249 | 1365 |f ~150 | “69 | “ogg = | ¢.—172 ad 
‘705—805|  88°7. | 84 |) ‘ 72°1 ee ie 
o— | o1 | «6 I|f~ 28 09 ae } +203 3°18 
-505— 49°9 55 fers | 48-0 ) ; 
-405— | 37°8 | 45 | j +12°3 | 1°73 40°2 j +11°8 1°58 
305— | 30 2451) oo. 3 34°3 ; 
s5— | ms | a5 | ~ 84] 74 | 29°7 } iol Mia 
*105— 20°5 ¢ 5-R 
2c Be a } -116 | 358 | 338 } -21'6 | 9:80 
1-905— 14°5 ee che 18°8 ) =f 
1:805— 12°4 12 ptr 187 | mete: ™ 0°02 | 
1-705 10:7 i3 (|) 13500) r x 
| 1-605— 9°3 3 "ie 4:0 80 11°2 : Bee 8°7 3°06 
| 1°505— 8'1 12 \ < oe 9°0 |) me 
1-405 — a ig |p +127 | 10-54 eo. |f +12°1 9°21 | 
1:305— 6°3 7 ee 51) = 
1-205— | 56 io (lpr t |) UFR oe: 4g-7:* 8°80 | 
Boer | Oe ae oe) r EO in - : 
1-105 | a3 | 9 yt 3-6 | 1:38 %  [} +10° | 4410 
lee .. % | ia ae « 





















6. The direct process of integration by parts applied to such expressions as 


n—4 


| z a-r) 7 


° n-4 
or +1 eo on @ 
aai dr and wf (l-?r*) ~ 7 ai 9 dr, 
. . . E » 
when n is even, merely introduces the sums and differences of the terms a> 2 
rT — 


at the extremes, where 7 is —1 or +1, with coefficients which are, in any 
particular case, easily calculable. 


Thus, » being 6, 


‘gs aN, Sa 
I (l-1) odr= ja 


+1 


—7 0 & + ‘On oe 
) Or 2h ~ Ors 


e? 
2 


+1 


-1 


[oa & 
& or? 2 


+1 


at | 


3 
= 2 x the sum of the extreme values of =, (0 — 3 cot 0 + 30 cot? @) 


— 2x the difference of the extreme values of 


If p=sina, so that the extreme values of @ are 


£ 


sin? 


2 


T 
——aand 


8 qd — Ocot 8). 


T 
> +4, the sums and 


“a 


differences may readily be expressed in terms of a, and the first few may here be 
tabulated: the table has been carried back as far as is necessary for the calculation 


of the fourth moment. 


sin?6 (7 +26” | 39 onkis =< = cot? at 





sin 6 { ( 3) ) 
6+(1-— )cotat 

p | eae 

i 

2 

2e> 

sin 6 
e. -—@co 

sin? (1 sates 





3 
(6-3 cot 6 +36 cot? 6) 


_p* 


sin‘ a 


— 96 cot 6+15 cot? 6-158 cot? 6) 


There are here two 
differences ; the simpler, 








sum 


m7 cot a (1 +a tan a) 
q +a” 

nm tana 

2 tan? a (1+a tan a) 


w tan*® a (143 tan? a) 


2tanta(4+9atana+ 15tan?a+ 15a tan®a) 


| 


| 


| 


| 


difference 


a cot? 6 


m1 (a+3 tan a+3 a tan*a) 


| cota {24 —2tana+ (+2) tan a} 


Tra 

2a tana 

a tan? a 

2 tan? a(a+3 tan a+3a tan? a) 


an tan‘ a (9 tan a+15 tan3 a) 


natural series, which appear alternately as sums and 
which may be expressed in the form 


oe 
— Sin? 


2 


0 y 
a -)a@ 
& ada 4 










is essentially a series of Legendre functions of the first kind; and may be 
expressed as 


|p 


-—1 
5 . tan? a 3 Py (1 tan a) ; 


and it is these only which occur in the evaluation of the even moments. 


7. It is, however, desirable to obtain general expressions for these integrals 
in terms of n and p, and to evaluate them when n is odd. 


For this purpose let us introduce a quantity $\phi$, such that 
$$\cos\phi=\cos\theta-k,$$
then, when k is sufficiently small, we may expand $\phi^2$ by Taylor’s theorem, so that 





O_o 
2-2” sin 000 2 * [2 (name) 2+ 
Now let k=phvV1—1, 
: ¢?_ & _ 8 @, PR(I=*% (. 0 ) & 
Base g~gtevl—-* og at fe \snon) 2* 
and differentiating twice with respect to h 
2 2 é Vp _ 2( 2 0 Ve ‘ 2\3 é °& 
ep u—*) (sams) y=) (gag) 5 ther —o} (Se 538) gh er 
whence, dividing by (1 — rye, we obtain 
Be Se RS Pe a ( 4a (s-908) e 
Ji. (aa $34) . he (1 — 7) \sin 790) 27 he sin 900) 2 
a ei Ve 
od |2 a=) (sin 700) 2 biker 
+1 i gn-1 ga 
, —p _7 : & Sed —_ 
so that E r? (1—r*) ami 5 dr 





may be obtained by multiplying by |x —3 the coefficient of h"-* in 


[” r?dr 1—¢gecotd 


a yr y > 


«=§ va ee r : sin? p 
when cos ¢ = cos 0— phV1 —7?=—p(r+hvV1—?°). 
* Our object might equally be achieved by the evaluation of the integral 


fr rdr/( a) 
tt wae ae : 


The quantity ¢ is determined by the equation 


cos ¢ = cos 6 — ph V1 — 7°, 
that is cos @ = — p(r+hV1—r°). 




























If now Yr =sin B, 
-h=tane, 
then cos = — psin £, 


cos $6 =—pV1+h?sin(8+e)=—pV1+h'sin P, 


and as r passes from —1 to +1, 


8 passes from -3 to +2, 


6 from = —a to o +4, 
, 7 7 
B from —5 te to 5) and thence to a +6 
and re) from <= a to a +a and thence back to 5 +4, 


where sina =pVi +h’, ¢ oscillates in the same manner as 0, with a somewhat 


greater amplitude, and slightly in advance in respect of phase. 
‘+14 de ; 
The expression p? 1—gcotd dr 


= sin? ce) Wa r 
may now be reduced to 


ef Set ag eT. ate ( ee , Pain ag sin 8 ) ag 











is sin? d 1 —sin?a ' sin? 9’ (1 —sin* a’ sin® Bp’) 
= / +5 +6 
=e [5 2 mp? f sin a’ sin 8’ dp’ 
=4 — sin? of area (1 —sin*a’ sin? 6’)? 


rel" (p)s sin a’ sin f fe 
~7 (1 —sin’a’ sin’ 8’)? 











2 pe eae aa Ye 
=P" + —— (=) + a ,(1— cos a’) 
cos @ cos?a@ \cosa cos? a’ 
- p?3r (1 2 sin a), 
cos? a’ Cos a 
but cos? a’ = 1 — p?(1 + h*) = cos’ a — sin’ a tan’e, 
+11— pcot dr a tan? a 
so that Pp 4] $ b.. 8 a 
+. a d J/I—- ~1—htana 


From this evaluation we deduce the general form 


n- 
‘e (_l-r 2 3 g ' dr = |n — 3a tan" @ .......... 2.08 (IIT). 




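The absolute frequency df, with which r falls in the range dr, is therefore 
$$df=\frac{(1-\rho^2)^{\frac{n-1}{2}}\,(1-r^2)^{\frac{n-4}{2}}}{\pi\,(n-3)!}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-2}\frac{\theta}{\sin\theta}\,dr.$$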





8. I do not see how to integrate the other expressions of the type 
$$\rho^2\int_{-1}^{+1}\frac{1-\phi\cot\phi}{\sin^2\phi}\,\frac{r^p\,dr}{\sqrt{1-r^2}},$$
although a form could probably be obtained when p is even. The general 


expression for the second moment may, however, be deduced by means of a 
reduction formula. 


By a process of integration by parts it appears that, if we write 


és oe 
| i (l-r) * » La . dr=Iy,», 

then Tnse.2 = Inse.o+ In.9 —n(n—1) In, 2, 
and since Lg Bar en —tana+ a) ; 
we may obtain successively 

a es - = ‘~éaraes a) 

4 3 / 
I,.g= 1202 (=F gh = 2 oa ve tan a + a) : 
6 5 3 


and so on, yielding, when n is even, the expression 


a 


Tn.g =In.o —7 |n — 2{ tan” edz, 


0 


a form which may well hold when n is odd. 


The above expressions are useful in tabulating the numerical values of the 
second moment, $\bar r^2+\sigma^2$, of the unit curve, which may easily be calculated in 
succession for different values of n when $\tan^2\alpha$ is taken to have some simple 
value. 


9. Before leaving this aspect of the subject it is worth while to give a more 
detailed examination of the mean of the frequency curves of r when n = 4. 


Two formulae are arrived at by Mr Soper, which are equivalent approximations 
of the second degree 
mong 1-/’ 3 ‘ont [ Ll { 3 na 
I, F=p|1- on {1 + dn +39) |=p | 1 — ‘bbe ig ( + 3p?) |> 
1 


ag Pe ae Ee 1 me — p? ( oe 
II. F=p|l 2(n-1) |! sazay- 9% |=e[1- 6 {1-791 opt | 

























and these we shall compare with the form 

$$\text{III.}\qquad \bar r=\frac{2}{\pi}\left(\alpha+\cot\alpha-\alpha\cot^2\alpha\right),$$

   ρ | .1000 | .2000 | .3000 | .4000 | .5000 | .6000 | .7000 | .8000 | .9000 | .9500 
   I | .0853 | .1710 | .2578 | .3463 | .4377 | .5333 | .6347 | .7443 | .8649 | .9304 
  II | .0847 | .1697 | .2555 | .3419 | .4310 | .5241 | .6236 | .7330 | .8566 | .9254 
 III | .0850 | .1704 | .2570 | .3451 | .4360 | .5301 | .6290 | .7357 | .8540 | .9209 

It will be observed that the approximations lie on either side of the exact 
value over the greater part of the range, and that the error of the first 
approximation increases up to the value when ρ = .9. The second formula 
gives the correct value somewhere between .8 and .9, and is thereafter too 
large. 


For the particular case ρ = .6608, 
I find (formula III) $\bar r$ = .5897, nearly the maximum difference from ρ, 
Mr Soper gives (p. 109) the value .5933 
and the experimental data .5609. 


The two theoretical values are much nearer to each other than either is to 
the experimental value. On the whole, it is obvious that even in this unfavour- 
able case Mr Soper’s formulae possess remarkable accuracy. 
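The exact values in row III can be recomputed directly. The following is a minimal sketch, assuming formula III reads r̄ = (2/π)(α + cot α − α cot²α) with ρ = sin α, as the tabulated figures indicate; the function name is a hypothetical one chosen for illustration.

```python
import math


def mean_r_samples_of_four(rho):
    """Mean of r for n = 4 by formula III, taking rho = sin(alpha)."""
    alpha = math.asin(rho)
    cot = math.cos(alpha) / math.sin(alpha)
    return 2.0 / math.pi * (alpha + cot - alpha * cot * cot)


if __name__ == "__main__":
    # Values of rho taken from the head of the comparison table,
    # together with the particular case rho = .6608.
    for rho in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.6608):
        print(f"{rho:.4f}  {mean_r_samples_of_four(rho):.4f}")
```

The formula is singular in form at ρ = 0 (cot α is infinite there), so the sketch is intended only for ρ strictly between 0 and 1.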


10. The use of the correlation coefficient r as independent variable of these 
frequency curves is in some respects highly unsatisfactory. For high values of r 
the curve becomes extremely distorted and cramped, and although this very 
cramping forces the mean $\bar r$ to approach ρ, the difference compared with 1 − ρ 
becomes inordinately great. Even for high values of n, the distortion in this 
region becomes extreme, and since at the same time the curve rapidly changes 
its shape, the values of the mean and standard deviation cease to have any very 
useful meaning. It would appear essential in order to draw just conclusions from 
an observed high value of the correlation coefficient, say .99, that the frequency 
curves should be reasonably constant in form. 

The previous paragraphs suggest that more natural variables for the treatment 
of our formulae are afforded by the transformations 
$$t=\tan\beta=\frac{r}{\sqrt{1-r^2}},\qquad \tau=\tan\alpha=\frac{\rho}{\sqrt{1-\rho^2}}.$$

The expression for the frequency curve (II) 
$$(1-r^2)^{\frac{n-4}{2}}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-1}\frac{\theta^2}{2}\,dr$$
now becomes 
$$\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-1}\frac{\theta^2}{2}\,\frac{dt}{(1+t^2)^{\frac{n-1}{2}}},$$
and the range of the curve is extended from −∞ to +∞. 


It is interesting that in the important case, τ = 0, the frequency reduces to 
$$\frac{dt}{(1+t^2)^{\frac{n-1}{2}}},$$
and the curves are identical with those found by “Student” for z, 
the probability integral of which he has tabulated in his first paper. 
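The reduction at τ = 0 is a plain change of variable and can be confirmed numerically. The sketch below assumes that at ρ = 0 the curve in r is proportional to (1 − r²)^((n−4)/2) and that t = r/√(1 − r²); the function names are hypothetical and the constants of proportionality are ignored.

```python
import math


def r_density_at_rho_zero(r, n):
    """Unnormalised ordinate of the curve of r when rho = 0."""
    return (1.0 - r * r) ** ((n - 4) / 2.0)


def t_density_at_tau_zero(t, n):
    """Unnormalised ordinate in t claimed in the text: (1 + t^2)^(-(n-1)/2)."""
    return (1.0 + t * t) ** (-(n - 1) / 2.0)


def transformed_r_density(t, n):
    """Ordinate in t obtained from the curve in r via r = t / sqrt(1 + t^2)."""
    r = t / math.sqrt(1.0 + t * t)
    dr_dt = (1.0 + t * t) ** -1.5  # derivative of r with respect to t
    return r_density_at_rho_zero(r, n) * dr_dt


if __name__ == "__main__":
    for n in (3, 5, 10):
        for t in (-2.0, 0.0, 0.7, 3.0):
            print(n, t, transformed_r_density(t, n), t_density_at_tau_zero(t, n))
```

The last two columns printed should coincide for every n and t.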


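11. The moments of these curves are obtained by the evaluation of the 
expressions 
$$\int_{-\infty}^{+\infty}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-1}\frac{\theta^2}{2}\,\frac{dt}{(1+t^2)^{\frac{n-1}{2}}},\qquad \int_{-\infty}^{+\infty}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-1}\frac{\theta^2}{2}\,\frac{t\,dt}{(1+t^2)^{\frac{n-1}{2}}},$$
and so on; of these the first is known already (III) to have the value 
$$\frac{\pi\,(n-3)!}{(1-\rho^2)^{\frac{n-1}{2}}},$$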
and the others may be obtained in succession, for 
| sill e tdt pile i 1 re 
In.p = = hess 


33 n—1 D> n-1~ Apn—l ~ a1 
-o (sin 800)" 2 , dp +7 


(1+#)? 


=a ye 2 


ha | eee, 
rt). 3° er Ope 
so that the first moment 
rp BX al tdt a @mn—4 _mwn—4(n—2)p° 
» \sin@d@/ 2° 1 


In~».0 ’ 


ice n—1 = a . n—2 n—4 ? 
(1+)? op (1 — p*)? (1 — p’)? 
hence 
$$\bar t=\frac{n-2}{n-3}\,\frac{\rho}{\sqrt{1-\rho^2}}=\frac{n-2}{n-3}\,\tau.$$

The mean, therefore, is greater than the true value τ by a constant fraction 
of its value. And this fraction decreases in the simplest possible manner as n 
increases. 


In the same way, we may evaluate the second moment, 
$$\bar t^2+\sigma^2=\frac{1}{n-4}\left\{1+(n-1)\tau^2\right\}$$
and 
$$\sigma^2=\frac{1+\tau^2}{n-4}+\frac{(n-2)\,\tau^2}{(n-4)(n-3)^2},$$
the third moment 
ae _(n —2)r 27r*(n—1)) 
_ (n—3)(n—4)(n—5) sa+e)+ (n—3) f? 


and the fourth moment 























mea 


| 





| 72 2 00 ‘01 03 





23 | 3°3529 | 3°3612 | 3°3768 
33 | 32222 | 3-2271 | 3-2365 | 3-2667 | 3°3343| 3-4619| 3°5773/| 3°6493| 3°6756| 3°6856| 3-6899 







3 : 6(n — 2)(3n?—11n + 12)7* 

Biot= ayaa {att + ae 

For high values of n, all but the first terms tend to vanish; $\beta_1$ tends to vary 
as $\rho^2$, and $\beta_2$ tends to become independent of ρ. In effect for high values of τ, 
where $\rho^2$ is nearly equal to unity, the form of the curve is nearly constant, but the 
skewness measured by $\beta_1$ decreases to zero at the origin, and changes its sense, 
when τ and ρ change their sign. 


6 (n—2) 7? 
(n—3)(n —5) 





(l+7°)+ 


Tables are appended for inspection rather than for reference which show the 
nature and extent of these changes in the form of the curves. 


Table of σ². 

  τ² =  |   .01  |   .03  |   .10  |   .30  |  1.00  |  3.00  | 10.00  | 30.00  | 100.00 
 n =  8 | .2531  | .2593  | .2810  | .3430  | .5600  | 1.180  | 3.350  | 9.550  | 31.250 
     13 | .1123  | .1148  | .1234  | .1481  | .2344  | .4811  | 1.344  | 3.811  | 12.444 
     18 | .07219 | .07372 | .07908 | .09438 | .1479  | .3010  | .8365  | 2.367  | 7.722 
     23 | .05319 | .05429 | .05817 | .06925 | .1080  | .2188  | .6066  | 1.714  | 5.592 
     33 | .03484 | .03555 | .03805 | .04518 | .07015 | .1415  | .3912  | 1.105  | 3.602 
     43 | .02590 | .02643 | .02827 | .03353 | .05194 | .1045  | .2886  | .8146  | 2.655 
     53 | .02062 | .02103 | .02249 | .02666 | .04123 | .08288 | .2287  | .6451  | 2.103 
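The entries of the table of σ² follow from the first two moments of t. The sketch below is an illustration rather than the author's own procedure: it assumes the mean (n − 2)τ/(n − 3) and the second moment about the origin {1 + (n − 1)τ²}/(n − 4) given in section 11, and subtracts the square of the one from the other. The function name is hypothetical.

```python
def variance_of_t(n, tau_squared):
    """sigma^2 of t, from the mean and the second moment about the origin."""
    second_moment = (1.0 + (n - 1) * tau_squared) / (n - 4)
    mean_squared = ((n - 2) / (n - 3)) ** 2 * tau_squared
    return second_moment - mean_squared


if __name__ == "__main__":
    tau_squared_values = (0.01, 0.03, 0.10, 0.30, 1.0, 3.0, 10.0, 30.0, 100.0)
    for n in (8, 13, 18, 23, 33, 43, 53):
        row = "  ".join(f"{variance_of_t(n, t2):8.4f}" for t2 in tau_squared_values)
        print(f"{n:3d}  {row}")
```

Printed to four decimals, the rows can be set beside the tabulated values of n and τ² as a check on individual entries.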
Table of β₁. 
l a 
10°00 | 30-00 | 100°00|  @ 


=| ‘Ol | 03 ‘10 |- +30 | 1:00 3:00 


Soa, WER ™ sta. Ro 


| 
53 | 4184 | 4-252 
1 

















\n= | | | | | 
8 -05685 | -1662 5076 | 1:230 | 2°450 |3°788 | 3-965 | 4:1 
13 | 01517 | -04776 | -1376 | -3400 | -7058 |1°018 | 1-205 | 1-271 | 1-296 | 13065 | 
18 | ‘008399 | 02463 | ‘07645 | *1914 | -4016 | ‘5857 | -6990| °7395| -7546| °7619 
23 | 005757 | 01691 | °05247 | °1317 | ‘3016 | -4093 | -4910| °5208| °5314| °5361 | 
33 | 003518 | 01035 | 03214 | ‘08100, ‘1731 | 2559 | *3031/ °3260| -3334| “3366 | 
| 43 | 002530 | 007435 | 02315 | ‘05841 | °1251 | °1858 | °2237| -2376| °2429| -2452 
| 53 | 001973 | 005798 | 01807 | -04562 ny "1458 | "1757 | *1868 | "1910 "1928 | 
| | | 
Table of β₂. 
| | | | 
i eae | 


¢ Buadh: skit Dek Pau Dee ake 


1:00 | 3°00 10°00 30°00 | 100°00| @ 
| 








r= | | | | 

8 | 6:0000 | 6°1137 | 6°3179 | 7°0179 | 8-4767 | 10°9668 | 12-9652 | 14°1116 | 14-5024 | 14-6508 14°7159 
13 | 3°8571 | 3°8802 | 3°9248 | 4:0663 | 4°3770 | 4°9397 5°4240| 5°7147| 5°8186)} 5°8578| 5°8750 
18 | 3°5000 | 3°5121 | 3°5356 | 3°6104 | 3°7937 4°0828 | 4°3532| 4°5186| 4°5783)| 4°6009| 4°6109 








43 | 3°1622 | 3°1656 | 3°1723 | 3°1938 | 3°2422 | 3°3261 |] 3°4172)| 3°4692 3°4886 | 3°4958 | 3°4991 
53 





3°1277 | 3°1303 | 3°1356 | 3°1522 | 3-1898| 3-2640 33281 | 32676 | 33826 3'3888 | 33909 
L ee. 











3°4271 | 3°5556 | 3°7486 3°9356 | 4-0511 | 40930 | 4°1089 | 4°1159 













12. The fact that the mean value $\bar r$ of the observed correlation coefficient is 
numerically less than ρ might have been interpreted as meaning that, given 
a single observed value r, the true value of the correlation coefficient of the 
population from which the sample is drawn is likely to be greater than r. This 
reasoning is altogether fallacious. The mean $\bar r$ is not an intrinsic feature of the 
frequency distribution. It depends upon the choice of the particular variable r 
in terms of which the frequency distribution is represented. When we use t as 
variable, the situation is reversed. Whereas in using r we cramp all the high 
values of the correlation into the small space in the neighbourhood of r = 1, 
producing a frequency curve which trails out in the negative direction and so 
tending to reduce the value of the mean, by using t, we spread out the region of 
high values, producing asymmetry in the opposite sense, and obtain a value $\bar t$ 
which is greater than τ. The mean might, in fact, be brought to any chosen 
point, by stretching and compressing different parts of the scale in the required 
manner. For the interpretation of a single observation the relation between 
$\bar t$ and τ is in no way superior to that between $\bar r$ and ρ. The variable t has been 
chosen primarily in order to give stability of form to the frequency curves in 
different parts of the scale. It is in addition a variable to which the analysis 
naturally leads us, and which enables the mean and moments to be readily 
calculated, and so a comparison to be made with the standard Pearson curves, but 
it is not, with these advantages, in a unique position. In some respects the 
function $\log\tan\frac12\left(\beta+\frac{\pi}{2}\right)$ is its superior as independent variable. 


I have given elsewhere* a criterion, independent of scaling, suitable for 
obtaining the relation between an observed correlation of a sample and the most 
probable value of the correlation of the whole population. Since the chance of 
any observation falling in the range dr is proportional to 
$$(1-\rho^2)^{\frac{n-1}{2}}\,(1-r^2)^{\frac{n-4}{2}}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-1}\frac{\theta^2}{2}\,dr$$


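for variations of ρ, we must find that value of ρ for which this quantity is a 
maximum, and thereby obtain the equation 
$$\frac{\partial}{\partial\rho}\left\{(1-\rho^2)^{\frac{n-1}{2}}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n-1}\frac{\theta^2}{2}\right\}=0.$$
Since 
$$\int_0^\infty\frac{dx}{(\cosh x+\cos\theta)^{n}}=\frac{1}{(n-1)!}\left(\frac{\partial}{\sin\theta\,\partial\theta}\right)^{n}\frac{\theta^2}{2},$$
we have 
$$\frac{\partial}{\partial\rho}\left\{(1-\rho^2)^{\frac{n-1}{2}}\int_0^\infty\frac{dx}{(\cosh x+\cos\theta)^{n-1}}\right\}=0,$$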

* R. A. Fisher, “ On an absolute criterion for fitting frequency curves,” Messenger of Mathematics, 
February, 1912. 


































which leads by a process of simplification to the equation 
$$\int_0^\infty\frac{dx}{(\cosh x-\rho r)^{n}}\,(r-\rho\cosh x)=0.$$

Since cosh x is always greater than ρr, the factor in the numerator, r − ρ cosh x, 
must change sign in the range of integration. We therefore see that r is greater 
than ρ. Further an approximate solution may be obtained for large values of n. 
The integrand is negligible save when x is very small, and we may write 
$$1+\frac{x^2}{2}\ \text{for}\ \cosh x$$
and 
$$(1-\rho r)^{n}\,e^{\frac{nx^2}{2(1-\rho r)}}\ \text{for}\ (\cosh x-\rho r)^{n}.$$
Then 
$$r\int_0^\infty e^{-\frac{nx^2}{2(1-\rho r)}}\,dx=\rho\int_0^\infty\left(1+\frac{x^2}{2}\right)e^{-\frac{nx^2}{2(1-\rho r)}}\,dx,$$
and in consequence, as a first approximation, 
$$r=\rho\left(1+\frac{1-r^2}{2n}\right).$$

The corresponding relation between t and τ is evidently 
$$t=\tau\left(1+\frac{1}{2n}\right).$$

It is now apparent that the most likely value of the correlation will in general 
be less than that observed, but the difference will be only half of that suggested 
by the mean, $\bar t$. 
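The quality of the first approximation can be examined numerically. The sketch below is an illustration under stated assumptions and is not the author's method: it takes the equation for the most probable ρ to be the vanishing of the integral of (r − ρ cosh x)/(cosh x − ρr)ⁿ from 0 to ∞, evaluates that integral by the trapezoidal rule on a truncated range, and solves for ρ by bisection. The function names, the truncation at x = 12 and the step count are hypothetical choices, suitable only for moderate n and for positive r.

```python
import math


def likelihood_equation(rho, r, n, upper=12.0, steps=4000):
    """Trapezoidal estimate of the integral of (r - rho cosh x)/(cosh x - rho r)^n."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        c = math.cosh(i * h)
        value = (r - rho * c) / (c - rho * r) ** n
        total += value * (0.5 if i in (0, steps) else 1.0)
    return total * h


def most_probable_rho(r, n):
    """Root in rho of the equation above, for 0 < r < 1, by bisection."""
    low, high = 0.0, r  # the integral is positive at rho = 0 and negative at rho = r
    for _ in range(60):
        mid = 0.5 * (low + high)
        if likelihood_equation(mid, r, n) > 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def first_approximation(r, n):
    """rho from the first approximation r = rho * (1 + (1 - r^2) / (2 n))."""
    return r / (1.0 + (1.0 - r * r) / (2.0 * n))


if __name__ == "__main__":
    for r, n in ((0.5, 25), (0.8, 25), (0.5, 10)):
        print(r, n, most_probable_rho(r, n), first_approximation(r, n))
```

For these cases the two values of ρ should differ only by a quantity of the order of 1/n², and both lie below the observed r.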


It might plausibly be urged that in the choice of an independent variable we 
should aim at making the relation between the mean and the true value approach 
the above equation, or rather that to which the above is an approximation, or 
that we should aim at reducing the asymmetry of the curves, or at approximate 
constancy of the standard deviation. In these respects the function 
$$\log\tan\tfrac12\left(\beta+\frac{\pi}{2}\right),\ \text{that is,}\ \tanh^{-1}r,$$
is not a little attractive, but so far as I have examined it, it does not tend to 
simplify the analysis, and approaches relative constancy at the expense of the 
constancy proportionate to the variable, which the expressions in 7 exhibit*. 


* [It may be worth noting that Mr Fisher’s τ is the φ (square root mean square contingency) of the 
more usual notation, and is the expression used in determining the probability that correlated material 
has been obtained by random sampling from uncorrelated material. Ed.] 








ON THE DISTRIBUTION OF THE STANDARD DEVIATIONS 
OF SMALL SAMPLES: APPENDIX I. TO PAPERS BY 
“STUDENT” AND R. A. FISHER. 


(EDITORIAL.) 


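CONSIDER the population distributed according to the law 
$$y=\frac{N}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(x-m)^2}{2\sigma^2}}\qquad\text{(i)},$$
and let a sample of n represented by the variate values $x_1, x_2, \ldots x_n$ be taken from 
it. Then the probability δP that this sample will lie between 
$x_1$ and $x_1+\delta x_1$, $x_2$ and $x_2+\delta x_2$, ... $x_n$ and $x_n+\delta x_n$ is 
$$\delta P=\frac{1}{(\sqrt{2\pi})^{n}\sigma^{n}}\,e^{-\frac{S(x_r-m)^2}{2\sigma^2}}\,\delta x_1\,\delta x_2\ldots\delta x_n
=\text{const.}\times e^{-\frac{1}{2\sigma^2}\left\{S(x_r-\bar x)^2+n(\bar x-m)^2\right\}}\,\delta x_1\,\delta x_2\ldots\delta x_n\qquad\text{(ii)},$$
where $\bar x=\frac{1}{n}S(x_r)$. If $\Sigma^2=\frac{1}{n}S(x_r-\bar x)^2$ we may write: 
$$\delta P=\text{const.}\times e^{-\frac12\left(\frac{n\Sigma^2}{\sigma^2}+\frac{n(\bar x-m)^2}{\sigma^2}\right)}\,\delta x_1\,\delta x_2\ldots\delta x_n\qquad\text{(iii)}.$$

Changing as Mr Fisher does (see p. 510 above) to $\bar x$ and Σ as coordinates 
we have: 
$$\delta P=\text{const.}\times e^{-\frac12\left(\frac{n\Sigma^2}{\sigma^2}+\frac{n(\bar x-m)^2}{\sigma^2}\right)}\,\Sigma^{n-2}\,\delta\bar x\,\delta\Sigma.$$

We see at once from this* that the law of distribution of samples of means is 
the normal curve 
$$Y=Y_0\,e^{-\frac12\frac{n(\bar x-m)^2}{\sigma^2}}\qquad\text{(iv)}$$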


* Of course the form reached above shows that for normal distributions there is no correlation 
between deviations in the mean and in the standard deviation of samples, a familiar fact. 











with mean = m, the mean of the population, and with standard deviation 
= σ/√n, a well-known result. 


On the other hand the distribution of samples of standard deviations is 
$$y=y_0\,\Sigma^{n-2}\,e^{-\frac{n\Sigma^2}{2\sigma^2}}\qquad\text{(v)}.$$

This curve was first reached by “Student” as a highly probable result 
following from the relations he had obtained from the moments of Σ²*. 
Mr Fisher's work thus enables us to justify “Student's” assumption. 
“Student” has discussed at some length the distribution curve for Σ. He 
has obtained the values of the moment coefficients $\mu_2$, $\mu_3$ and $\mu_4$ and the 
general expressions for the means when n is even and odd. The whole problem 
is of such importance that it seems worth reconsidering, and providing tables 
showing the approach of the distribution curve to normality as n rises from 
4 to 100. 


The following investigation largely repeats work given by “Student,” but it 
expresses the values for $\mu_3$, $\mu_4$, and $\beta_1$ and $\beta_2$ in a different form†. We shall not 
use approximate expressions for the constants, for the order of terms in 1/n 
depends so largely on the relative magnitude of their coefficients, that such 
expressions become unreliable for values of n under 100. 


Clearly (v) is a skew curve with range limited at one end, Σ = 0, and not at 
the other, Σ = ∞. See Figure p. 524. 


We shall write the standard deviation of Σ, $\sigma_\Sigma$, and the moments of the 
frequency about the end of the range O as $M_1'$, $M_2'$, etc., while the moment- 
coefficients about Q will be as usual $\mu_1$ (= 0), $\mu_2$, etc. Obviously $\mu_2=\sigma_\Sigma^2$. It is 
desirable to ascertain $\bar\Sigma$, $\tilde\Sigma$, $\sigma_\Sigma$ and the skewness as well as $\beta_1$ and $\beta_2$ for the 
distribution. We do this to show the rapidity of change to a normal distribution. 
It is well, however, to notice a priori that for n large the distribution does become 
normal. 


* “Student’s” approximate values for 8; and 2 (loc. cit. p. 10) are, we fear, erroneous. He gives 
re a a ‘ae SS ‘ 
D?=n-5+ Sa? but it is needful to have a further term in — in order to obtain 8; and f: correctly to 
2° 8n n2 


the second approximation in =. If this further term be p/n”, then: 


64p -3 ; ee | 3 
By (a + =. *), as against ‘‘Student’s” In (1 - ). 


~ On 4n 


1 1 
Be 3+ ~ ” ” ” 3 (1- in) 
n 
An examination of our table (p. 529) shows that ‘*Student’s” corrections are not of the right sign to 
agree with the facts, and that further no constant value of p would give good results even for fairly 


high values of n, i.e. it is probable that the term in = in D? is of equal importance with that in D 
+ “The Probable Error of a Mean,” Biometrika, Vol. v1. pp. 1—25, more especially pp. 4, 6, and 
8 to 10. 










524 Standard Deviations of Small Samples 











[Figure: skew frequency curve of Σ, with origin O at Σ = 0; OP = mode = $\tilde\Sigma$, OQ = mean = $\bar\Sigma$.] 


To obtain this approximation to (v) let us assume ~=2+e, and suppose 


n 


small. Expanding log y we find: 


log y = log yo + (n — 2) log 3 — gn (S/o + =" (1-5 =) e 





ne* n—2 o°\ er 
——-(1+4 >) + terms in e*. 
n 2 


and thus: 


~ Ae ee) ee 
Y=Yr7e — We ~ a /(2n) + ete. 


Or, if « be small compared with o, the distribution is the normal curve: 


} ail 


y=y € Sie Se ee nls oe dha in rds Pe .(vii), 


with mean at $\Sigma=\sigma\sqrt{\frac{n-2}{n}}$ and standard deviation $\sigma/\sqrt{2n}$. If n as usual be 
considerable, this agrees with the ordinary result, i.e. $\bar\Sigma=\sigma$ and $\sigma_\Sigma=\sigma/\sqrt{2n}$, the 
distribution being treated as normal. 


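We will now deal with the full result (v). We have: 
$$M_p'=\int_0^\infty y\,\Sigma^{p}\,d\Sigma=y_0\int_0^\infty\Sigma^{n+p-2}\,e^{-\frac{n\Sigma^2}{2\sigma^2}}\,d\Sigma\qquad\text{(viii)},$$
and clearly $M_p'$ depends on a knowledge of 
$$I_q=\int_0^\infty u^{q}\,e^{-\frac12u^2}\,du\qquad\text{(ix)},$$
for we have: 
$$M_p'=y_0\left(\frac{\sigma}{\sqrt n}\right)^{n+p-1}I_{n+p-2}.$$

Integrating by parts we find: 
$$I_q=(q-1)\,I_{q-2}=\begin{cases}(q-1)(q-3)\ldots3\cdot1\cdot I_0, & \text{if } q \text{ be even},\\ (q-1)(q-3)\ldots4\cdot2\cdot I_1, & \text{if } q \text{ be odd}\end{cases}\qquad\text{(x)}.$$
Now 
$$I_0=\int_0^\infty e^{-\frac12u^2}\,du=\sqrt{\frac{\pi}{2}},\qquad\text{and}\qquad I_1=\int_0^\infty u\,e^{-\frac12u^2}\,du=1,$$
thus $M_p'$ is determined, and will depend on whether n + p be even or odd. 
But 
$$M_2'=y_0\left(\frac{\sigma}{\sqrt n}\right)^{n+1}I_n=y_0\left(\frac{\sigma}{\sqrt n}\right)^{n+1}(n-1)\,I_{n-2},\qquad M_0'=y_0\left(\frac{\sigma}{\sqrt n}\right)^{n-1}I_{n-2}.$$
Hence 
$$\mu_2'=M_2'/M_0'=\sigma^2\,\frac{n-1}{n}\qquad\text{(xi)}.$$
To find the modal value $\tilde\Sigma$ we must differentiate (v), and we have 
$$(n-2)\,\Sigma^{n-3}\,e^{-\frac{n\Sigma^2}{2\sigma^2}}-\frac{n}{\sigma^2}\,\Sigma^{n-1}\,e^{-\frac{n\Sigma^2}{2\sigma^2}}=0,$$
which gives 
$$\tilde\Sigma^2=\sigma^2\,\frac{n-2}{n}\qquad\text{(xii)},$$
a result in agreement with the mean Σ of the approximate solution (vi), as we 
should anticipate. 

It now remains to find $M_1'$ and $M_0'$ absolutely. 

Suppose n even, then 
$$I_{n-1}=(n-2)(n-4)\ldots2\times1,\qquad I_{n-2}=(n-3)(n-5)\ldots1\times\sqrt{\frac{\pi}{2}}\qquad\text{(xiii)}.$$
Hence for n even 
$$\bar\Sigma=\mu_1'=M_1'/M_0'=\sigma\sqrt{\frac{2}{\pi n}}\;\frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\frac{2}{1}\qquad\text{(xiv A)}.$$
Again for n odd 
$$I_{n-1}=(n-2)(n-4)\ldots1\times\sqrt{\frac{\pi}{2}},\qquad I_{n-2}=(n-3)(n-5)\ldots2\times1,$$
and hence 
$$\bar\Sigma=\sigma\sqrt{\frac{\pi}{2n}}\;\frac{(n-2)(n-4)\ldots1}{(n-3)(n-5)\ldots2}\qquad\text{(xiv B)}.$$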



These accurate values of =, the mean standard deviation of samples, were first 
given by “Student” (loc. cit. p. 8). Now by Wallis’ Theorem 


ms Fhe 4 1)6 ee 
Q\“" ~ product of odd numbers up to 2n — 1° 


Thus (xiva) for n large tends to become 


Zu—Va_-i=e n—1 


Vn n 
: . —2 n—2 
and (xivB) Sal Sa J : 
( Vn Vn —2 ¢ n 


These values, however, really only suffice to show the approach of $\bar\Sigma$ to σ, as 
they depend on the neglect of terms of the order 1/n as compared to 1, and we should 
get absurd results for $\sigma_\Sigma^2$ by subtracting the square of the above values of $\bar\Sigma$ from 
$\mu_2'$ in (xi). All they really tell us is that for n large $\bar\Sigma=\sigma$, but they give no true 
approximation in 1/n. 


If we use Stirling’s Theorem up to the third termf, Le. 


l= V2ma ate (1 + . + . ), 





12a ° 28822 
; 3 7 
bte = _ SRE dav nabatcee hie abetee . 
we obtain S=c (1 - aaa) (xv) 
a oe. 1 

Cs = Dn 1- ia) PTOTTUTERELELIO EET (xvi), 

aX i. 

Ba dnt? Pe at 


51840a° 
expression to reach the second terms in yp; and w,. As we have indicated (p. 523, 
ftn.), such a term, even if used, will not lead to profitable results. It is better 
to work with the full formulae. It is desirable to find the full third and fourth 
moment coefficients in order to determine §, and #, and so measure as » increases 
the rapidity of approach to the normal curve. 


But we should be compelled to introduce the term — into Stirling’s 


* Student” has used an extension of Wallis’ Theorem, which will suffice for certain constants only. 
+ We can write (xiv) 
o (ain |4n-1)? 19 


me |n-2 Tv 


M! 


= 
al 
< 
_- 
= 

— 


and (xivs) 
n—- 


c (n —2)| 3 = = 
= Ta (a9 [4 (n - 3))2 /F @ cecaweceronseweseenveseeedesseues (xviii), 


SS 
a 


and then apply Stirling’s Theorem. 


























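We have: 
$$M_4'=y_0\left(\frac{\sigma}{\sqrt n}\right)^{n+3}I_{n+2}=y_0\left(\frac{\sigma}{\sqrt n}\right)^{n+3}(n+1)\,I_n=\frac{\sigma^2}{n}\,(n+1)\,M_2'.$$
Hence 
$$\mu_4'=M_4'/M_0'=(n+1)\,\frac{\sigma^2}{n}\,\mu_2'=\sigma^4\,\frac{n^2-1}{n^2}\qquad\text{(xix)}.$$
$$M_3'=y_0\left(\frac{\sigma}{\sqrt n}\right)^{n+2}I_{n+1}=y_0\left(\frac{\sigma}{\sqrt n}\right)^{n+2}n\,I_{n-1}=\sigma^2M_1'.$$
Hence 
$$\mu_3'=M_3'/M_0'=\sigma^2\mu_1'=\sigma^2\,\bar\Sigma\qquad\text{(xx)}.$$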


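Transferring to mean: 
$$\mu_3=\mu_3'-3\mu_2'\mu_1'+2\mu_1'^3=\frac{\sigma^2\,\bar\Sigma}{n}\left(1-\frac{\sigma_\Sigma^2}{\sigma^2/2n}\right)\qquad\text{(xxi)}.$$

Thus $\mu_3$ will grow small, not only owing to the factor 1/n but because $\sigma_\Sigma^2$ tends 
to equal $\sigma^2/2n$ as n increases. 

Now 
$$\beta_1=\frac{\mu_3^2}{\mu_2^3}=\frac{\sigma^4\,\bar\Sigma^2}{n^2}\,\frac{\left(1-\dfrac{\sigma_\Sigma^2}{\sigma^2/2n}\right)^2}{\sigma_\Sigma^6},$$
or 
$$\beta_1=8n\left(\frac{\bar\Sigma}{\sigma}\right)^2\left(1-\frac{\sigma_\Sigma^2}{\sigma^2/2n}\right)^2\Big/\left(\frac{\sigma_\Sigma^2}{\sigma^2/2n}\right)^3\qquad\text{(xxii)}.$$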


Here /o is of the form 1— x, and 





and thus 8, tends as n increases to take the form 8y,*/n, but as y. may be a 


considerable numerical coefficient 8y.? may be commensurable with n till n is very 
considerable. 


We next turn to y,, and shall endeavour to express it in terms of p.=as°, 


Since fs! = fp + 3? = 0? ( ai *) 
n 


by (xi), we have 













Further by (xx) 


1 
bs pr’ a o*u,” =o (1 = “) ‘hie Oo. 
Thus 
Pas = Pa — Apts’ pa’ + 6 pte’ my® — By," 


=ot{1--4(1- 7) +4446 (1-2) (1-=-4) 
nn nr oC 





n OF 
ate 2 
-3(1--) + (1-2)-3 4 
n ad n Co 
( 5 1 3 oc: 3 cs? \2 
aa vg ae ee, ee, Se ae 
wa |4n? Qn (4 =) (1 am) ant (1 sea | Hues 


Hence 


1 f 3\ / co: o;* \? re 
= 5—2n(4—-—-){(1— a) —8 (1-- =) .»(XX111). 
Bs {os?/(o*/2n)}? \5 \ n} \ o?/2n o?/2n ( ) 
Our results for $\mu_3$ and $\mu_4$, although expressed in other notation, are in 
accordance with “Student’s” (loc. cit. p. 9), so also are our results (xv) and (xvi) 
although reached by a different method of approximation. We do not agree with 
his approximate values for $\mu_3$, $\mu_4$, or $\beta_1$ and $\beta_2$. 


The calculations to find $\bar\Sigma/\sigma$, $\sigma_\Sigma/(\sigma/\sqrt{2n})$, $\beta_1$ and $\beta_2$ presented some trouble. 
In order to be correct to the four figures of decimals in the tabled results, tables 
of ten-figure logarithms had to be used in the logarithmic part of the work. 
Formulae (xvii) and (xviii) of the ftn. p. 526 were adopted, using Degen’s Tables of 
the Logarithms of Factorials. $M_1'$ was calculated to nine figures, and even then, 
as n became large, the determination of the antilogarithms presented considerable 
difficulty. Further the powers of $1-\sigma_\Sigma^2/(\sigma^2/2n)$ gave rise to trouble. The 
numerical work was undertaken by Ethel M. Elderton and Beatrice M. Cave, 
to whom very hearty thanks are due. We think the results may be depended 
on to the figures tabulated. 


It will be seen that by the time n = 50 the mode is as close to the mean as we 
should expect to find in any random sample of normal material; the average 
mean $\bar\Sigma$ is only 1.5% from the usually adopted value σ, and the average standard 
deviation $\sigma_\Sigma$ only 0.3% from its customary value $\sigma/\sqrt{2n}$. Further $\beta_1$ and $\beta_2$ are 
.0105 and 3.0003 respectively, or for all practical purposes have reached their 
normal values. We think it must be concluded that for samples of 50 the usual 
theory of the probable error of the standard deviation holds satisfactorily, and 
that to apply it for the case of n = 25 would not lead to any error which would be 
of importance in the majority of statistical problems. 


On the other hand, if a small sample, n < 20 say, of a population be taken, the 
value of the standard deviation found from it will be usually less than the standard 
deviation of the true population. If we take the most probable value, $\tilde\Sigma$, as that 
which has most likely been observed, then the result should be divided by the 
number in the column entitled mode $\tilde\Sigma/\sigma$ to obtain the most reasonable value for σ. 
For example, if Σ be observed, and n = 20, then the most reasonable value to give σ 
is Σ/.9487. 


The paper by Mr Fisher and the accompanying table more or less complete 


the work on the distribution of standard-deviations outlined by “Student” in 
1908. 


Table of Values of the Constants of the Frequency Distribution of the Standard 
Deviations of Samples drawn at random from a Normal Population. 











 Size of  |  Mode  |  Mean  |  Standard Deviation   | Measures of Deviation from Normality 
 Sample n |  Σ̃/σ   |  Σ̄/σ   | σ_Σ/σ  | σ_Σ/(σ/√2n)  | Skewness |   β₁   |   β₂ 

     4    | .7071  | .7979  | .3367  |    .9524     |  .2696   | .2359  | 3.1082 
     5    | .7746  | .8407  | .3052  |    .9651     |  .2168   | .1646  | 3.0593 
     6    | .8165  | .8686  | .2808  |    .9725     |  .1857   | .1255  | 3.0370 
     7    | .8452  | .8882  | .2612  |    .9774     |  .1648   | .1011  | 3.0251 
     8    | .8660  | .9027  | .2452  |    .9808     |  .1495   | .0845  | 3.0181 
     9    | .8819  | .9139  | .2318  |    .9834     |  .1378   | .0725  | 3.0136 
    10    | .8944  | .9227  | .2203  |    .9853     |  .1285   | .0634  | 3.0106 
    11    | .9045  | .9300  | .2104  |    .9868     |  .1209   | .0564  | 3.0085 
    12    | .9129  | .9359  | .2017  |    .9881     |  .1144   | .0507  | 3.0070 
    13    | .9199  | .9410  | .1940  |    .9891     |  .1088   | .0461  | 3.0059 
    14    | .9258  | .9453  | .1871  |    .9900     |  .1041   | .0422  | 3.0049 
    15    | .9309  | .9490  | .1809  |    .9907     |  .0998   | .0390  | 3.0042 
    16    | .9354  | .9523  | .1752  |    .9914     |  .0961   | .0362  | 3.0036 
    17    | .9393  | .9551  | .1701  |    .9919     |  .0927   | .0337  | 3.0032 
    18    | .9428  | .9576  | .1654  |    .9924     |  .0897   | .0316  | 3.0028 
    19    | .9459  | .9599  | .1611  |    .9928     |  .0869   | .0297  | 3.0025 
    20    | .9487  | .9619  | .1570  |    .9932     |  .0844   | .0281  | 3.0022 
    25    | .9592  | .9696  | .1407  |    .9948     |  .0745   | .0219  | 3.0014 
    30    | .9661  | .9748  | .1285  |    .9956     |  .0674   | .0180  | 3.0009 
    35    | .9710  | .9784  | .1191  |    .9963     |  .0620   | .0153  | 3.0007 
    40    | .9747  | .9811  | .1114  |    .9967     |  .0577   | .0132  | 3.0005 
    45    | .9775  | .9832  | .1051  |    .9971     |  .0541   | .0117  | 3.0004 
    50    | .9798  | .9849  | .0997  |    .9974     |  .0512   | .0105  | 3.0003 
    55    | .9816  | .9863  | .0951  |    .9977     |  .0488   | .0095  | 3.0003 
    60    | .9832  | .9874  | .0911  |    .9979     |  .0467   | .0087  | 3.0002 
    65    | .9845  | .9884  | .0875  |    .9980     |  .0447   | .0080  | 3.0002 
    70    | .9856  | .9892  | .0844  |    .9982     |  .0430   | .0074  | 3.0002 
    75    | .9866  | .9900  | .0815  |    .9983     |  .0415   | .0069  | 3.0001 
    80    | .9874  | .9906  | .0789  |    .9984     |  .0402   | .0064  | 3.0001 
    85    | .9882  | .9911  | .0766  |    .9985     |  .0389   | .0060  | 3.0001 
    90    | .9888  | .9916  | .0744  |    .9986     |  .0378   | .0057  | 3.0001 
    95    | .9894  | .9921  | .0725  |    .9987     |  .0367   | .0054  | 3.0001 
   100    | .9899  | .9925  | .0706  |    .9987     |  .0358   | .0051  | 3.0000 











TUBERCULOSIS AND SEGREGATION. 
By ALICE LEE, D.Sc.


(1) In his book The Prevention of Tuberculosis (London: Methuen, no date
on the issue we have used) Dr A. Newsholme has examined the influence of
segregation on Tuberculosis. This is the topic of Chapter XXXV. In the opening
of this chapter, he writes:


The exact measure of institutional segregation of phthisis is the ratio stating how many of 
the total days of sickness (number of patients and number of days of sickness) are passed in
institutions. This ratio and the equivalents for it which have to be used in practice may 
for convenience be called the segregation ratio. The need for equivalents for the ratio as stated 
above arises from the fact that we are dealing with actual recorded experience, and the material 
has to be taken from the records as they happen to exist. (p. 266.) 


After noting the incompleteness of the records, Dr Newsholme continues:


It becomes necessary therefore to select other figures which vary approximately with the total 
days of tuberculous sickness and the total days of tuberculous sickness passed in institutions. 
(p. 266.) 


We shall discuss below what “indirect measures of segregation” Dr Newsholme 
selects, but he gives the following most proper caution with regard to them: 

In using these indirect measures of institutional treatment of tuberculosis and of its pre- 
valence it must be remembered that they are indirect and approximate. Thus, for instance, 
figures for institutional treatment usually give the number of cases and not days of treatment, 
and while they tell how many people were segregated in institutions do not show the average 
duration, still less the quality of the treatment. Any of these indirect forms of segregation ratio 
has therefore to be verified wherever possible by the application to the same community and 
period of one or more other forms of the ratio, and checked wherever practicable by a special 
examination of sample constituent communities whose figures are included in the total. (p. 268.) 

Dr Newsholme in the course of his chapter gives a number of very high
correlations between the phthisis deathrate and the indirect forms of the
segregation ratio he has selected, and he interprets these as well as a long series
of graphs as demonstrating that institutional segregation has been a most
important factor in the diminution of the phthisis deathrate. Now any two
variates which are changing continuously with the time (say, the consumption
of bananas per head of the population and the fall in the birthrate) will exhibit
high correlation and will show graphically very high association, if plotted to
appropriate scales and on a common time basis. Until the time factor has
been removed, either by partial correlation or otherwise, it would be most
unwise to interpret such cases as providing any causal relationship.
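The point about the time factor can be illustrated numerically. The following is a minimal sketch, not data from the paper: `series_a` and `series_b` are hypothetical series made of a straight-line trend plus small deterministic fluctuations that are unrelated to each other. The crude correlation is very high, while the correlation of the residuals left after fitting a straight line on time (one way of removing the time factor "otherwise") is negligible.

```python
import math


def pearson(x, y):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def detrend(series, t):
    """Residuals from a least-squares straight line fitted on time t."""
    n = len(t)
    mt, ms = sum(t) / n, sum(series) / n
    slope = sum((a - mt) * (b - ms) for a, b in zip(t, series)) / sum(
        (a - mt) ** 2 for a in t
    )
    return [b - (ms + slope * (a - mt)) for a, b in zip(t, series)]


# Hypothetical series over 38 "years": one rising, one falling.
# The cosine and sine terms stand in for unrelated year-to-year fluctuations.
years = list(range(38))
series_a = [t + math.cos(2.0 * t) for t in years]
series_b = [100.0 - t + math.sin(2.0 * t) for t in years]

crude = pearson(series_a, series_b)
detrended = pearson(detrend(series_a, years), detrend(series_b, years))

if __name__ == "__main__":
    print("crude correlation:     ", round(crude, 3))
    print("detrended correlation: ", round(detrended, 3))
```

A straight-line detrend is only adequate when the trend is close to linear; for curved trends the differencing approach used later in the paper is the more natural tool.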


It seemed accordingly worth while to reinvestigate Dr Newsholme’s problems 
with the aid of a rather more adequate statistical apparatus. 


(2) We must frankly confess at the outset that we have had great difficulty
in following Dr Newsholme’s description of the methods he has adopted to
measure the amount of segregation. His charts do not seem always in
accordance with his tables, and both are occasionally out of agreement with his
definitions. As he does not give the raw data on which his correlations are
based, but only condensed versions of them in his tables and graphs, it is
impossible to test his conclusions without returning to the original sources,
which are not always stated, and when we have found them and our results
differ, we are unable to say whether the difference is due to failure in his or
in our arithmetic, or to divergences between his and our records.


Dr Newsholme uses in all some six measures of the segregation ratio, four 
intentionally and two apparently by inadvertence. 


Let $P$ = total population of a given area, $\phi$ = the total number of annual deaths
from phthisis. Then $\phi/P$ multiplied by 10,000 or 100,000, as the case may be, gives
the crude deathrate from phthisis. Let $D_i$ be the deaths from all causes which
occur in institutions and $D$ the total deaths in the same area, then $100D_i/D$ is
Dr Newsholme’s first approximation to the segregation ratio*. On p. 270 he
gives two tables which show (a) in England and Wales as a whole, (b) in London,
that, while in the course of forty years $1000\phi/P$ has practically halved, $100D_i/D$
has practically doubled. The data, Dr Newsholme tells us, show “not only a
very close correspondence between the increase of total institutional segregation
measured by the ratio in question and the decrease of phthisis, but an even more
striking similarity in the ratio at which these changes have occurred” (p. 271).
This is illustrated by a graph on p. 271, in which the logarithms of the phthisis
deathrate are plotted to time against the logarithms of the indices of institutional
deaths to all deaths†. We do not know why Dr Newsholme has chosen this
method of representation; it certainly, with his choice of scales, makes the two
curves roughly parallel, but this does not demonstrate the “similarity in the ratios
at which these changes have occurred.” For, if the actual values be plotted to
the time, the curve of phthisis deathrate is convex and the institutional
deathrate concave to the time axis, in other words while the rate of one is increasing,
the rate of the other is decreasing during the period in question, always on the
supposition that we plot the results as Dr Newsholme has done with reversed
directions of increasing scales for the two indices. He states that “the experience
is summarised in the high correlation coefficients of .91 for England and Wales
(1878-1903) and .90 for London (1866-1904)” (p. 271). The correlations
found from his actual tables do not appear to agree with these, being, for example,
-.93 for England and Wales with the negative sign as we should anticipate; but
as Dr Newsholme does not give the same years for his correlation coefficients as
in his tables, he may have worked out his coefficients for individual years. It is
impossible to test the matter, as neither the figures nor their source are provided.

* The assumption made appears to be that for the period in question $D_i$ is proportional to the
institutional deaths from phthisis, a very big assumption.

† The logarithms of the ratios of institutional deaths to all deaths appear to be either wrongly
plotted or wrongly calculated.


If, however, we take his Tables LXII and LXIII, and apply the variate 
difference method* to Dr Newsholme’s data as they stand in his book, which 
are all the data available, we find 


Correlation of Phthisis Deathrate and Ratio of Deaths in
Institutions to Total Deaths.
England and Wales: Third Differences   -.174 ± .293,
London:            Second Differences  -.094 ± .252.†


In other words the data show no significant relationship between this measure 
of segregation and the phthisis deathrate, when the time-factor is annulled, even 
with the early differences. It is impossible to press the matter further because 
the data are far too sparse for difference treatment, but the results, such as they 
are, are sufficient to indicate that Dr Newsholme’s high correlations are solely due 
to the fact that both variates are continuously changing with the time. 
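The differencing idea behind these figures can be sketched as follows. This is an illustrative sketch only: it correlates successive differences of two series and does not reproduce the probable-error formula of the variate difference method, nor the paper's data. The series `x` and `y` are hypothetical, each a quadratic trend in time plus a small fluctuation unrelated to the other's.

```python
import math


def pearson(x, y):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def difference(series, order):
    """Apply first differencing `order` times (order 0 returns a copy)."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def difference_correlations(x, y, max_order):
    """Correlation of the k-th differences of x and y for k = 0..max_order."""
    return [
        pearson(difference(x, k), difference(y, k)) for k in range(max_order + 1)
    ]


# Hypothetical series: opposite quadratic trends plus unrelated fluctuations.
t = range(38)
x = [k * k + math.cos(2.0 * k) for k in t]
y = [-(k * k) + math.sin(2.0 * k) for k in t]
r = difference_correlations(x, y, 3)

if __name__ == "__main__":
    for order, value in enumerate(r):
        print("difference order", order, "correlation", round(value, 3))
```

With a quadratic trend the first differences still carry a linear trend, so the correlation stays high until the second differences; this is why the order of difference at which the coefficient settles matters, and why sparse data limit how far the method can be pushed.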


(3) As a second measure of segregation Dr Newsholme takes $100\phi_i/\phi$ and
$1000\phi/P$ is then correlated with this, $\phi_i$ being the deaths from phthisis in
institutions. On p. 275 Dr Newsholme gives very meagre data for Brighton,
Sheffield and Salford in groups of years, six pairs of values for Sheffield, five for
Brighton and four for Salford. It is thus impossible to test these for annulment
of the time-factor, and no references are given to the sources of the original data.


On p. 276 we read: 


Coefficients of correlation summarising this correspondence for long series of single years
work out at .67 for Salford from 1884 to 1904 and .80 for Sheffield from 1876 to 1905‡.


If the arithmetical values be correct, they should certainly have negative signs, 
but even then they would not demonstrate anything but the increasing use of 
institutions and the decreasing prevalence of phthisis during the years in question. 


* Biometrika, Vol. X. pp. 179, 341.

† These values might be modified if we could go to higher differences, but this is impossible on the
very limited data which Dr Newsholme provides. On these data all we can state is that no evidence
of organic relationship between the variates, such as is asserted by Dr Newsholme to exist, can be
demonstrated.

‡ There is no statement as to why Brighton has been omitted.


There is, however, a much graver criticism to be made of Dr Newsholme’s
method in this measure of segregation. He proposes to correlate

$100\phi_i/\phi$ and $1000\phi/P$,

and interprets the high correlations as a sign of the value of segregation in
reducing the phthisis deathrate. We have not his data to test his conclusions by,
but we can compare them against certain results for 38 years in (i) England and
Wales, (ii) Scotland, and (iii) Ireland. Here they are:


Correlation of Phthisis Deathrate and Ratio of Institutional
Phthisical to all Phthisical Deaths.

Years      | District          | Correlation
1866-1903  | Scotland          | -.9815 ± .0040
1866-1903  | England and Wales | -.9750 ± .0054
1866-1903  | Ireland           | -.8720 ± .0262
1876-1905  | Sheffield         | -.80 ± .0443
1884-1904  | Salford           | -.67 ± .0811
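The figures after the ± signs are consistent with the conventional probable error of a correlation coefficient, $.6745(1 - r^2)/\sqrt{n}$, taking $n$ as the number of years in each period. That this is the formula used is an inference from the fact that it reproduces the printed values, not something the text states; the sketch below simply performs the check.

```python
import math


def probable_error(r: float, n: int) -> float:
    """Conventional probable error of a correlation coefficient r from n pairs."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# (district, correlation, number of yearly observations in the stated period)
rows = [
    ("Scotland", -0.9815, 38),           # 1866-1903
    ("England and Wales", -0.9750, 38),  # 1866-1903
    ("Ireland", -0.8720, 38),            # 1866-1903
    ("Sheffield", -0.80, 30),            # 1876-1905
    ("Salford", -0.67, 21),              # 1884-1904
]

errors = {name: probable_error(r, n) for name, r, n in rows}

if __name__ == "__main__":
    for name, r, n in rows:
        print(f"{name:18s} r = {r:+.4f}  p.e. = {errors[name]:.4f}")
```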


The reader may imagine in this table a confirmation of Dr Newsholme’s
results, for the larger material gives higher values of the correlations. On the
contrary, these correlations have been obtained by taking as the measure of
segregation the ratio

$$100 \times \frac{\text{Mean institutional deaths per annum from phthisis, 1866-1903}}{\text{Annual total deaths from phthisis}} = \frac{100\,\bar{\phi}_i}{\phi}.$$


Now it is clear that this index never varies with the increasing percentage 
of institutional deaths from phthisis. Yet all the correlations are greater than 
Dr Newsholme’s! We have little doubt that he would get higher values than he 
has done, if he replaced the actual institutional deaths per annum by the constant 
mean value. In other words the results reached by him are of no significance, 
for we get higher correlations by putting a single fictitious value for the annual 
institutional deathrate. 


The real source of his result is not the strong influence of segregation on
phthisis, but the spurious correlation introduced by using the phthisis deaths, $\phi$,
in the numerator of one variate, $1000\phi/P$, and in the denominator of the other,
$100\phi_i/\phi$. Thus no scientific results of value can be found from Dr Newsholme’s
second measure of segregation.
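The effect of placing the same quantity in the numerator of one index and the denominator of the other can be seen with invented figures. In the sketch below every number is hypothetical (a linearly falling count of phthisis deaths, a linearly rising population, and a fixed institutional count); none is taken from the Registrar-General's returns. Because the institutional count is held constant, the "segregation" index carries no information about segregation, yet it correlates strongly with the deathrate.

```python
import math


def pearson(x, y):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


years = range(38)
# Hypothetical annual phthisis deaths (falling) and population (rising).
phthisis_deaths = [60_000 - 540 * t for t in years]
population = [21_000_000 + 320_000 * t for t in years]
# A single fictitious institutional figure, identical in every year.
constant_institutional_deaths = 5_000

deathrate = [1000 * d / p for d, p in zip(phthisis_deaths, population)]
fixed_index = [100 * constant_institutional_deaths / d for d in phthisis_deaths]

spurious_r = pearson(deathrate, fixed_index)

if __name__ == "__main__":
    print("correlation with a constant numerator:", round(spurious_r, 3))
```

The sign is negative and the magnitude high purely because the deaths appear on opposite sides of the two ratios, which is the argument made above against the second measure of segregation.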


In discussing this second measure of segregation, Dr Newsholme lays great stress
on the part played by asylums for the insane in segregating the tuberculous. He
notes that the percentage of lunatics treated privately with relatives and others was
18.4 in 1859 and fell to 5.5 in 1902, thus marking increasing segregation during the
period of fall in the phthisis deathrate. He states (p. 274) that: “the deathrate
from tuberculosis in borough and county asylums in 1901 was 15.8 per cent. of
the inmates, and over ten times as great as in the general population.” Now
Dr Newsholme’s figure appears to be quoted from the 56th Annual Report of the
Commissioners in Lunacy, and in this case it should read 15.8 per 1000 and not
per 100, and although Dr Newsholme appears to have made a similar slip in
dealing with the deathrate in the general population, he seems to be comparing
deaths from all forms of tuberculosis among the insane (some of which have
possibly a direct relation to their insanity) with deaths from phthisis alone in the
general population. Further he has made no allowance for the very marked
difference between the age distributions of the two groups he is comparing.
The difference is so great that a phthisis deathrate of 1.46 per 1000 in the
general male population is equivalent to one of 2.41 per 1000 among the insane
population of males. Even if the corrected deathrate among the insane for
phthisis were ten times its magnitude among the sane, we fail to understand
what Dr Newsholme means when he asserts that: “the segregation of each
tuberculous lunatic has been equivalent to the withdrawal of ten ordinary
tuberculous persons” (p. 274). Because tuberculosis among lunatics is ten times as
frequent (judging by deaths, and accepting for the purpose of argument
Dr Newsholme’s figures), why should the isolation of one tuberculous lunatic be
equivalent to the withdrawal of ten sane tuberculous persons? That must suppose a
tuberculous lunatic capable of spreading ten times the infection of a tuberculous but
sane individual. All Dr Newsholme could say would be that from the standpoint
of segregation it is ten times more desirable to segregate any lunatic, than
any sane person, for the former is ten times as likely to die of tuberculosis.
Dr Newsholme brings no evidence to show that the individual tuberculous
lunatic is ten times as dangerous as the individual tuberculous sane person.
As a matter of fact we still need very careful investigation of the relation of
lunacy to tuberculosis, not only having regard to some forms of tuberculosis as
possible sources of feeble-mindedness, if not of insanity, but also having regard
to whether the old idea of asylum segregation as a possible cause of the spread
of tuberculosis among lunatics is wholly erroneous, and we might further examine
whether the new idea that the majority of tuberculous lunatics were tuberculous
on admission is in its turn wholly sound*. In the present state of our
knowledge we think the assertion that the increased segregation of lunatics has
substantial relation to the decrease in the phthisis deathrate is quite unproven.


* Many lunatics enter and re-enter asylums; it does not follow because they died of tuberculosis
and were tuberculous on last admission that their tuberculosis was there on first admission.


(4) Dr Newsholme’s third approximation to the segregation ratio is the
index $100p_i/p_t$, where $p_i$ is the number of paupers in institutions and $p_t$ is
the total number of paupers, indoor and outdoor. Unfortunately Dr Newsholme’s
usage does not agree with his definition. The index he appears to use is generally
$100p_t/p_i$, and the values of this are given in the last column of Table LXV
(p. 277) and Table LXVII (p. 279). In Table LXVI (on p. 277), however, the
100 factor is dropped and $p_i/p_t$ again used in the heading to the central column,
although the figures in that column appear to refer to $100p_t/p_i$. Below this
table occur the words:


This experience for the entire series of individual years is expressed by a coefficient of
correlation of -.94 between segregation measured by the fraction of pauper population treated
in institutions and the phthisis deathrate. (p. 277.)


The correlation to support Dr Newsholme’s views should be negative if
$100p_i/p_t$ has been used, and positive if $100p_t/p_i$ has been used. But as many of his
other correlations are given with the wrong sign, it is difficult to discover what
measure of segregation he actually has used. To add to the confusion the index
actually plotted is $\log p_t/p_i$, and not $100p_i/p_t$, which is what Dr Newsholme
defines as his index. We have accordingly in our analysis of the figures, to be
given later, used both indices $100p_i/p_t$ and $100p_t/p_i$.


It is very difficult to appreciate how the ratio $100p_i/p_t$ can effectively measure
the segregation ratio; it is indeed impossible to agree with Dr Newsholme’s view
that any of his indices “measure with approximate accuracy the ratio which states
how many of total days of tuberculous sickness are passed in institutions.”


The policy of compelling as many paupers as possible to go into the workhouse 
was directly adopted with a view to diminishing the total pauperism as well as 
abuses connected with outdoor relief, and that policy is the source of increase in
the index $100p_i/p_t$. Had Dr Newsholme examined his own Tables LXV, LXVII
and LXIX carefully, he would have seen that the percentage of indoor paupers 
on the general population has remained almost constant for the period in question, 
while the total paupers per cent. of the general population in England with Wales 
and in Scotland have decreased. If the same relative number of paupers are segre- 
gated now as formerly, how can this segregation have diminished the chances of 
infection in the community? We can hardly assume that all paupers are tuber- 
culous, or markedly so relatively to other men, so that the reduction of the number 
of outside paupers by indoor segregation is equivalent practically to a reduction 
pro tanto (note the extraordinarily high correlations !) of the number of tuberculous 
in the community. If so, then the reduction of the tuberculous deathrate would 
be due not to the segregation, but to the large decrease in the total pauperism 
relative to the population of this country. The correlation, as we shall demon- 
strate, is not between the segregation of paupers and the phthisis deathrate, but 
between the diminution of total pauperism and the phthisis deathrate. We shall 
investigate how far this relationship between total pauperism and the phthisis 
deathrate is “organic,” i.e. continues after the annulment of the time-factor, or is 
purely due to the fact that both pauperism and phthisis have diminished during 
the forty-year period under consideration. 


It was this third definition of a segregation ratio in conjunction with the
fourth segregation ratio to be considered later that led us to realise that the whole
problem must be dealt with afresh, and the modern methods of partial correlation
and variate difference correlation applied to its various aspects. We have taken
the period used by Dr Newsholme, 1866-1903 inclusive, and have used the figures
for each individual year thus obtaining 38 entries, which are few indeed, but the
best we can probably do with data of this kind, and therefore directly comparable
with Dr Newsholme’s results, for he seems to have used individual years for his
correlations although he does not always say so (cf. pp. 271 and 280), and
notwithstanding that his tables are all given for five-year periods.


The population numbers for England and Wales (Table A) were taken from 
the Registrar-General’s Annual Report for 1909, and the phthisis deaths from the 
Reports for 1866-1903; the average of each five years’ period agrees with Dr 
Newsholme’s values for phthisis, but the values for indoor and for total paupers 
do not quite agree with his. Dr Newsholme was therefore written to and asked 
whence he obtained his numbers. He was kind enough to reply, but said that 
he was unable to refer at the moment to the original tables, but that undoubtedly 
the data were the statistics given in the Annual Reports of the Registrars-General 
for England, Scotland and Ireland. We then examined the Local Government 
Board returns and found that Dr Newsholme apparently had used the pauper 
returns for the January quarter of each year. We kept therefore to the Registrar- 
General’s Report, as the numbers there given are based on the Local Government 
Board’s returns for the whole year, which are a fairer measure of pauperism than 
those for the January quarter alone. 


For Scotland, our numbers (Table A) agree with Dr Newsholme’s for both 
phthisis and indoor paupers, except when we take the first five-year period 
(1866-70), where they differ slightly. In the case of total paupers for the periods 
1866-70, 1881-85, and 1896-1900 our figures do not agree*. We cannot find 
any reason for these divergences except a slip in his or our arithmetic, or the 
possibility that a wrong number of outside paupers has been taken by one or other 
of us. We do not think the differences in the values are such as to invalidate 
a comparison of results. 


In Ireland the only serious discrepancy in our values is in the total number 
of paupers for the period 1876-80. 


These discrepancies, however, emphasise the very necessary rules for statistical 
treatment: (i) that the ultimate raw data should be published with every inquiry, 
and (ii) it should be stated exactly where they are taken from, and how they have 
been treated. 


Table A gives our raw data, Table B our deathrates and indices based thereon. 


* We are unable to compare his and our data for individual years, because Dr Newsholme has only
published his data for five-year periods.


We have correlated the phthisis deathrate taken as $10^{6}\phi/P$ with $100p_i/p_t$ and
$100p_t/p_i$. Taking first England and Wales, and calling these three indices
respectively $I_\phi$, $I_1$ and $I_2$, we find:


Correlation of $I_\phi$ and $I_1$ = -.9664 ± .0072,
               $I_\phi$ and $I_2$ = +.9298 ± .0148.


Dr Newsholme gives -.94 as the coefficient of correlation “between segregation
measured by the fraction of pauper population treated in institutions and the
phthisis deathrate” (p. 277). Having regard to his confusion of $I_1$ and $I_2$ and
his frequent interchange of the signs of correlation coefficients, we can only say
our results confirm his high numerical value, but not his actual figure.


But does this actual figure mean that there is any real relationship between
segregation and the phthisis deathrate? To test this, we replaced the index $I_1$ by
$I_3$, where

$$I_3 = 100 \times \frac{\text{Mean number of indoor paupers per } 10^{6} \text{ of the population, 1866-1903}}{10^{6} \times p_t/P} = 100\,\overline{\left(\frac{p_i}{P}\right)} \Big/ \left(\frac{p_t}{P}\right).$$

In this index the relative number of indoor paupers is assumed to remain
absolutely constant. We found:


Correlation for England and Wales of $I_\phi$ and $I_3$ = -.9459 ± .0115,


that is to say we get substantially the same value, a value higher than
Dr Newsholme’s, by putting the number of indoor paupers relative to the
general population constant throughout the period. It is very difficult, in the face
of such a result, to suppose that segregation of paupers has anything whatever
to do with the diminution of the phthisis deathrate. It is clearly due to a
negative correlation of a high magnitude between $P/p_t$ and $\phi/P$, or to a positive
correlation between $p_t/P$ and $\phi/P$, i.e. to a correlation between a high total pauper
rate and a high phthisis deathrate. Dr Newsholme’s result merely reduces to
the statement that total pauperism in England and Wales has diminished
contemporaneously with phthisis. If the result has nothing to do with segregation,
can we assert that the reduction of phthisis is causally related to the reduction in
total pauperism?


Overlooking for a moment a new objection to be raised later, let us apply the
variate difference method to the correlation of $\phi/P$ with $100p_i/p_t$ and $100p_t/p_i$
in the cases of England with Wales, of Scotland, and of Ireland; also to the
correlation of $\phi/P$ with the index $I_3 = 100\,\overline{(p_i/P)}/(p_t/P)$ in the case of
England with Wales. The following are the results:





TABLE I.

Correlation of $10^{6}\phi/P$ with the indices $I_3$, $I_1$ and $I_2$.

              |            England with Wales             |         Scotland          |          Ireland
              | $I_3$       | $I_1$       | $I_2$         | $I_1$       | $I_2$       | $I_1$       | $I_2$
--------------|-------------|-------------|---------------|-------------|-------------|-------------|-------------
Crude Indices | -.946 ± .012 | -.966 ± .007 | +.930 ± .015 | -.952 ± .010 | +.920 ± .017 | -.881 ± .024 | +.893 ± .022
$\Delta_1$    | +.090 ± .134 | -.258 ± .126 | +.340 ± .120 | -.265 ± .126 | +.250 ± .127 | -.280 ± .125 | +.235 ± .128
$\Delta_2$    | -.201 ± .149 | -.461 ± .123 | +.542 ± .110 | -.240 ± .147 | +.182 ± .151 | -.264 ± .145 | +.180 ± .151
$\Delta_3$    | -.335 ± .153 | -.508 ± .127 | +.567 ± .116 | -.205 ± .164 | +.086 ± .170 | -.226 ± .163 | +.162 ± .167
$\Delta_4$    | -.407 ± .155 | -.518 ± .136 | +.547 ± .130 | -.186 ± .179 | +.024 ± .185 | -.182 ± .179 | +.133 ± .182
$\Delta_5$    | -.475 ± .153 | -.528 ± .143 | +.529 ± .142 | -.182 ± .191 | -.003 ± .198 | -.145 ± .194 | +.108 ± .195
$\Delta_6$    | -.538 ± .149 | -.543 ± .147 | +.533 ± .151 |      -       |      -       | -.112 ± .206 | +.081 ± .208
$\Delta_7$    | -.584 ± .145 | -.562 ± .150 | +.539 ± .156 |      -       |      -       |      -       | +.044 ± .219
$\Delta_8$    | -.614 ± .143 | -.587 ± .151 | +.557 ± .159 |      -       |      -       |      -       | -.004 ± .230





It will be seen from this table that whether we use the index $I_1$ or its
inverse $I_2$, we get practically the same results, naturally with changed sign.
But the results themselves are of extraordinary interest. For both Scotland 
and Ireland, when we proceed to annul the time-factor by correlating successive 
differences, we find that the high correlations interpreted by Dr Newsholme as 
marking a relation between pauper segregation and phthisis deathrate entirely 
disappear or become less than their probable errors. There is thus no organic 
relation between these variates as measured by the above indices. In the case 
of England and Wales, however, while there is a reduction on annulment of the 
time-factor to roughly two-thirds of the high value noted by Dr Newsholme, 
this value does not tend to disappear with increasing differences. Thus in 
England with Wales, as apart from the remainder of Great Britain, there would 
at first sight appear to be an organic relation between segregation of paupers 
and the phthisis deathrate. But our first column under the England with Wales 
section shows that if we fix the percentage in the general population of these 
indoor paupers and then annul the time-factor, we reach a slightly higher value 
of this apparent organic relation. It has therefore nothing to do with segregation. 
Thus Dr Newsholme’s interpretation of his original high correlations appears in 
every case fallacious. 

There are two methods of testing this result, i.e. the absence of organic
relationship between indoor pauperism and phthisis. Suppose we correlate the
crude numbers of phthisis deaths per annum and of indoor paupers per annum,
the resulting coefficient will have very small logical value because both these
variates are continuously changing with the time*. But now suppose we annul
the time-factor by correlating the differences of these variates, then we shall free
ourselves from the influence of the time-variate, and in doing this we shall also
free ourselves practically from the influence of change of population, which is a
time change.

* It is noteworthy that the England with Wales and the Scotland correlation coefficients for these
crude variates are high and negative, but for Ireland the coefficient is moderate and positive. Thus
the factors at work must be totally different in the two Islands. Since indoor paupers relative to
the population have remained singularly constant the increase of phthisis deaths must have been much
slower than the population increase in Great Britain, but somewhat faster in Ireland.

The following table resulted from this investigation. 
TABLE II.

Correlation of Crude Phthisis Deaths ($\phi$) and Indoor Paupers ($p_i$).

Variates   | England with Wales | Scotland     | Ireland
-----------|--------------------|--------------|-------------
Crude      | -.934 ± .014       | -.718 ± .053 | +.457 ± .086
$\Delta_1$ | -.376 ± .116       | -.206 ± .130 | -.092 ± .134
$\Delta_2$ | -.302 ± .141       | -.219 ± .148 | -.103 ± .154
$\Delta_3$ | -.213 ± .164       | -.180 ± .166 | -.143 ± .168
$\Delta_4$ | -.100 ± .183       | -.157 ± .181 | -.147 ± .181
$\Delta_5$ | -.016 ± .198       | -.158 ± .193 | -.140 ± .194














It will be seen that for all three countries, whether we start with the positive 
correlation of the Irish or the negative correlation of the English and Scottish 
returns, there is no remaining significant correlation after annulment of the time- 
factor between indoor pauperism and phthisis. 


A second method of verifying our conclusions is to find the partial correlation 
between indoor pauperism and phthisis deaths for a constant value of the total 
population and a constant value of total pauperism. We thus ask the question 
whether with a constant population and a constant amount of total pauperism, 
an increase of indoor pauperism would organically affect the number of deaths 
from phthisis. By making the population and the total pauperism constant we 
are largely producing an annulment of the time-factor and ascertaining whether 
a change in the number of indoor paupers due to causes other than temporal 
influences the number of deaths from phthisis. 


The system of correlation coefficients given in Table III was determined.
Here the values of ${}_{Pt}r_{\phi i}$ for England with Wales and for Scotland confirm
the conclusions we have reached by other methods, i.e. there is no significant
relationship at all between phthisis and indoor pauperism. The value for Ireland
is, perhaps, significant, but having regard to its smallness (-.3 ± .1) and the size
of its probable error, no one can lay real stress on it, in opposition to the results
of the other two countries. In general the coefficients for the Irish data appear
very anomalous, and certainly divergent from those for Great Britain.
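The partial coefficients can be checked against the total coefficients with the standard recursion for partial correlation. The sketch below uses the England with Wales totals of Table III; the assignment of each total coefficient to a pair of variates is a reconstruction from the damaged table, adopted here because it reproduces the printed partial coefficients to within the rounding of the three-figure inputs. Variable names are illustrative.

```python
import math


def partial(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Total coefficients, England with Wales (phi = phthisis deaths,
# i = indoor paupers, t = total paupers, P = population).
r_phi_i = -0.934
r_i_P = 0.955
r_phi_P = -0.950
r_i_t = -0.544
r_phi_t = 0.577
r_t_P = -0.674

# First-order partials.
phi_i_given_P = partial(r_phi_i, r_phi_P, r_i_P)
phi_i_given_t = partial(r_phi_i, r_phi_t, r_i_t)
phi_t_given_P = partial(r_phi_t, r_phi_P, r_t_P)
i_t_given_P = partial(r_i_t, r_i_P, r_t_P)

# Second-order partial: phthisis and indoor paupers, with both the
# population and total pauperism held constant.
phi_i_given_Pt = partial(phi_i_given_P, phi_t_given_P, i_t_given_P)

if __name__ == "__main__":
    print("phi,i | P   :", round(phi_i_given_P, 3))
    print("phi,i | t   :", round(phi_i_given_t, 3))
    print("phi,t | P   :", round(phi_t_given_P, 3))
    print("phi,i | P,t :", round(phi_i_given_Pt, 3))
```

Because two of the totals are close to unity, the partials are sensitive to the third decimal place of the inputs, so agreement within a few units in the third place is all that can be expected from the rounded table.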

Thus our investigation of the relation between indoor pauperism and phthisis
appears to be entirely opposed to Dr Newsholme’s conclusions. We find the
segregation of paupers to have no substantial influence on deaths from phthisis.
The one outstanding point at present, the relation between $p_t/P$ and $\phi/P$ after
annulment of the time-factor (see Table I), has no bearing on the segregation
problem of Dr Newsholme.

TABLE III.

Total and Partial Correlation Coefficients of Crude Numbers of Indoor and

Coefficients         |                        | England with Wales | Scotland     | Ireland
---------------------|------------------------|--------------------|--------------|-------------
Total Coefficients   | $r_{\phi i}$           | -.934 ± .014       | -.718 ± .053 | +.457 ± .087
                     | $r_{iP}$               | +.955 ± .010       | +.831 ± .034 | +.763 ± .046
                     | $r_{\phi P}$           | -.950 ± .011       | -.896 ± .022 | +.479 ± .084
                     | $r_{it}$               | -.544 ± .077       | -.528 ± .079 | -.251 ± .103
                     | $r_{\phi t}$           | +.577 ± .073       | +.780 ± .043 | +.070 ± .109
                     | $r_{tP}$               | -.674 ± .060       | -.805 ± .038 | -.684 ± .058
Partial Coefficients | ${}_{P}r_{\phi i}$     | -.287 ± .100       | +.111 ± .108 | +.162 ± .107
                     | ${}_{t}r_{\phi i}$     | -.905 ± .020       | -.575 ± .073 | +.492 ± .083
                     | ${}_{Pt}r_{\phi i}$    | -.189 ± .106       | -.017 ± .109 | -.305 ± .099











To approach nearer to the meaning of the relation between total pauperism
and phthisis we determined the correlation between $p_t$ and $\phi$ for constant $P$, and
found

${}_{P}r_{\phi t} = -.277 \pm .101$,

which is barely significant having regard to its probable error.


Now after elimination of the time-factor, we found for the correlation of φ/P 
and J, at the eighth difference, −·614 ± ·143, but this is the same as the 
correlation of P/p_t and φ/P. Hence the correlation of p_t/P and φ/P must be very 
significant, positive and of the order ·6. Now if p_t and φ after the removal of the 
time-factor were practically independent of each other, there would be a high 
positive correlation between p_t/P and φ/P, due to the fact that P, when it 
takes (after annulment of the time-factor) any random deviation, appears in 
both variates’ denominators. In other words, we are inclined to believe that 
the high negative correlation between φ/P and J is solely due to spurious 
correlation arising from the nature of the indices used. 
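
The effect of a common denominator can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not a reworking of the returns used in the paper: the three series are drawn independently, the means, spreads, seed and sample size are arbitrary assumptions, and `pearson` is a helper written for the illustration.

```python
import random


def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(1915)  # fixed seed so the run is repeatable
n = 5000

# Three mutually independent series with the same relative variability.
paupers = [rng.gauss(100, 10) for _ in range(n)]
deaths = [rng.gauss(100, 10) for _ in range(n)]
population = [rng.gauss(100, 10) for _ in range(n)]

# The numerators themselves are uncorrelated ...
r_raw = pearson(paupers, deaths)

# ... but the two indices share the denominator and become correlated.
pauper_rate = [a / p for a, p in zip(paupers, population)]
death_rate = [d / p for d, p in zip(deaths, population)]
r_ratio = pearson(pauper_rate, death_rate)

print(round(r_raw, 3), round(r_ratio, 3))
```

With equal coefficients of variation in all three series, the correlation of the two indices comes out near one half although the numerators are independent.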

To throw still more light on the matter we have investigated the correlation 
between the total number of paupers and the total number of deaths from phthisis 
when the time-factor approaches annulment. It will be seen from the table 
below that for both Scotland and Ireland there is finally no relationship at all 
between phthisis deaths and total pauperism. 


ALICE LEE 541 


On the other hand, England with Wales is tending to a value at least approaching 
to the crude correlation. We have therefore this noteworthy result: England with 
Wales starts with a considerable value and concludes with an equally great value, 
Scotland starts with a high value and ends with a zero value, Ireland starts with 
an insignificant value 


TABLE IV. 

Relation between Total Pauperism (p_t) and Deaths from Phthisis (φ). 

|              | England with Wales | Scotland     | Ireland      |
| Crude values | +·577 ± ·073       | +·780 ± ·043 | +·070 ± ·109 |
| Δ1           | −·095 ± ·134       | +·025 ± ·135 | +·164 ± ·132 |
| Δ2           | +·174 ± ·151       | +·025 ± ·156 | +·144 ± ·152 |
| Δ3           | +·286 ± ·158       | +·012 ± ·172 | +·131 ± ·169 |
| Δ4           | +·347 ± ·163       | +·033 ± ·185 | +·110 ± ·183 |
| Δ5           | +·413 ± ·164       | +·027 ± ·198 | +·090 ± ·196 |








and ends with an insignificant value. If pauperism were causative of phthisis, it 
is hard to believe that this would not manifest itself in the Scottish and Irish 
returns; these negative any such hypothesis. It would appear that there are 
essential differences in the treatment of pauperism in the three countries. I 
suggest, but I cannot demonstrate the view, that phthisis itself leads to pauperism 
in England, i.e. that the relatives of the phthisical breadwinner more often are 
allowed to become paupers in England than in the sister countries. In other 
words, that the only organic relationship between pauperism and phthisis we have 
been able to discover may be due to a relatively harsh treatment in England of 
the dependents of the phthisical. 

To show how effectively the variate difference correlation method removes 
time influence, we may note that we correlated total population (P) with total 
pauperism (p_t) and total phthisis deaths (φ) with total population by this method, 
with a view to ascertaining whether the relation between p_t and φ would be 
modified, if we determined it for constant population. 

The following results were reached : 


TABLE IV bis. England with Wales. 

|              | Total Population and Total Pauperism | Total Population and Phthisis Deaths |
|              | (P and p_t)                          | (P and φ)                            |
| Crude values | −·674 ± ·060                         | −·950 ± ·011                         |
| Δ1           | +·457 ± ·107                         | −·039 ± ·135                         |
| Δ2           | −·016 ± ·156                         | −·205 ± ·149                         |
| Δ3           | −·022 ± ·171                         | −·089 ± ·170                         |
| Δ4           | −·031 ± ·185                         | +·002 ± ·185                         |




Thus we see that apart from the time-factor there is no relation whatever 
between either pauperism or phthisis and population. In the relation between 
total pauperism and phthisis deaths, no further correction for population is 
needful than that obtained by the annulment of the time-factor as in Table IV. 
Table IV bis shows us that neither pauperism nor phthisis is organically related 
to population, although we might well have anticipated that greater density of 
population would influence pauperism and provide greater chances of infection, 
and so of deaths, in the case of phthisis. 
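
The way differencing removes a common time trend can be sketched on synthetic data. In the example below two series depend on time but are otherwise independent; their crude correlation is almost perfect, while the correlations of their successive differences are close to zero. Everything here is an illustrative assumption: the linear trends, the noise level, the seed, and a series length of 400 (much longer than the 38 years of the paper, chosen so that sampling fluctuation is small). The helper names are not from the paper.

```python
import random


def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def difference(series, order):
    """Return the difference of the given order of a series."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


rng = random.Random(38)  # fixed seed so the run is repeatable
n = 400

# Two series that move with time but have independent random parts.
rising = [2.0 * t + rng.gauss(0, 5) for t in range(n)]
falling = [500.0 - 1.5 * t + rng.gauss(0, 5) for t in range(n)]

crude = pearson(rising, falling)
diff_corr = {
    m: pearson(difference(rising, m), difference(falling, m)) for m in (1, 2, 3)
}

print(round(crude, 3), {m: round(r, 3) for m, r in diff_corr.items()})
```

A linear trend is removed already by the first difference; higher differences are needed when the dependence on time is of higher order, at the cost of shortening the series.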


(5) We now come to Dr Newsholme’s fourth and last measure of segregation. 
It is “the ratio in which the number of paupers treated in workhouses and work- 
house infirmaries stand to the total number of deaths in the community” (p. 276). 
In our notation this is p_i/φ, or as an index 100p_i/φ. But in the figures actually 
given in Table LXV (p. 277), and headed Segregation Ratio, Dr Newsholme 
appears to be using 100φ/p_i. The same remark applies to Tables LXVIII and 
LXIX (pp. 280—281). Thus it is difficult to be certain of what Dr Newsholme 
intends to be taken as his fourth measure of segregation. In our discussion below 
we have used both 100φ/p_i and 100p_i/φ to provide for both contingencies and 
to check our results. 


Unfortunately Dr Newsholme makes little attempt to justify either his third 
or fourth ratio as an approximate measure of segregation. It will be remembered 
that he has defined the true method of measuring segregation to consist in 
forming the ratio “stating how many of the total days of sickness (number of 
patients and number of days of sickness) are passed in institutions” (p. 267). In 
this fourth index of segregation he replaces phthisical patients in institutions by 
indoor paupers, and total of phthisical patients by total deaths from phthisis, 
dropping any question of the number of days of sickness. At the very least this 
seems to involve two assumptions, (a) either that all indoor paupers are phthisical 
or that for the period in question the proportion of indoor paupers who are 
phthisical has remained constant, (b) that for the period in question the number 
of deaths from phthisis has remained a constant fraction of the total number of 
cases of phthisis. It is difficult to see how, without such assumptions, such figures 
can “measure with approximate accuracy the ratio which states how many of the 
total days of tuberculous sickness are passed in institutions” (p. 267). 


Yet in another paragraph Dr Newsholme quotes with apparent approval the 
statement of Mr Fleming, who speaks of the “great change in the character of 
workhouse inmates during recent years,...The able-bodied inmates are gone and 
the sick inmates have come” (p. 273). Such a statement is absolutely inconsistent 
with the assumption (a) above. 


To justify (b) we must assert that for the last fifty years of the nineteenth 
century there has been no change in efficiency of treatment in the case of tuber- 
culosis, for without this we cannot assume that deaths from phthisis are even an 
approximate measure of the number of cases (p. 267). The fact that the reduction 
in the phthisis deathrate has been substantially different for the different age 
groups, and is especially marked in the case of children, seems to indicate that 
recovery, at least from puerile phthisis, is more frequent now than formerly. 
However, not to spend more time on these assumptions, which, it appears to us, 
Dr Newsholme has by no means justified, let us examine whither this fourth 
method of approximately measuring segregation leads us. Table V gives the 
necessary coefficients. 


TABLE V. 

Correlation and Difference Correlations of 10^5 φ/P and 100p_i/φ or 100φ/p_i. 

| Variate correlated | England with Wales          | Scotland                    | Ireland                      |
| with 10^5 φ/P      | 100p_i/φ     | 100φ/p_i     | 100p_i/φ     | 100φ/p_i     | 100p_i/φ     | 100φ/p_i      |
| Crude              | −·760 ± ·046 | +·976 ± ·005 | −·861 ± ·028 | +·944 ± ·012 | −·712 ± ·054 | +·666 ± ·061  |
| Δ1                 | −·868 ± ·033 | +·848 ± ·038 | −·755 ± ·058 | +·772 ± ·055 | −·819 ± ·045 | +·707 ± ·068  |
| Δ2                 | −·879 ± ·035 | +·875 ± ·037 | −·824 ± ·050 | +·834 ± ·047 | −·922 ± ·023 | +·755 ± ·067  |
| Δ3                 | −·895 ± ·034 | +·874 ± ·041 | −·809 ± ·059 | +·824 ± ·055 | −·954 ± ·015 | +·791 ± ·064  |
| Δ4                 | −·895 ± ·037 | +·860 ± ·048 | −·811 ± ·064 | +·805 ± ·065 | −·964 ± ·013 | +·805 ± ·065  |
| Δ5                 | −·898 ± ·038 | +·847 ± ·056 | −·786 ± ·076 | +·788 ± ·075 | −·970 ± ·012 | +·831 ± ·061  |
| Δ6                 | −·907 ± ·037 | +·850 ± ·058 | −·788 ± ·079 | +·794 ± ·077 | −·973 ± ·011 | +·848 ± ·059  |
| Δ7                 | −·917 ± ·035 | +·835 ± ·067 | −·792 ± ·082 | +·791 ± ·082 |              | +·857 ± ·056* |


Now this table at any rate demonstrates a very high correlation between φ/P 
and p_i/φ, while the previous table for Dr Newsholme’s third approximate segre- 
gation ratio led in the case of England with Wales to the value −·587, and in 
the case of Scotland and Ireland to negligibly small values! Dr Newsholme 
himself writes: “Any of these indirect forms of segregation ratio has therefore 
to be verified wherever possible by the application to the same community and 
period of one or more other forms of the ratio, and checked where practicable 
by a special examination of sample constituent communities whose figures are 
included in the total. This has been done so far as the information obtainable 
allowed. It will be seen that the results obtained by applying different ratios to 
the experience of the same country and period are usually, though not invariably, 
in good agreement” (p. 268). 


What is quite clear from the above results is that, while in the case of 
Dr Newsholme’s two chief measures of segregation, there is very sensible difference 
in the case of England with Wales, there is an absolute discordance in the cases of 
both Scotland and Ireland. Accordingly on the basis of his own axiom, that we 
must check our results by application of one or more other forms of the ratio, 


* This correlation continues to rise until it reaches ·929 with the thirteenth difference, but with such 
high differences the “population” is so reduced that the method ceases really to be reliable. 










we are bound to reject these ratios as even approximate measures of 
segregation*. 

But it would not be satisfactory to leave the matter here and not provide some 
explanation of why this fourth segregation ratio, both before and after the annul- 
ment of the time-factor, leads to such high correlations. Luckily the matter is 
capable of a perfectly straightforward and obvious explanation, which would have 
been anticipated had Dr Newsholme had in mind the danger of “spurious cor- 
relation.” 


What he is correlating are essentially φ/P and p_i/φ. The latter may be 
written (p_i/P)/(φ/P). Now p_i/P is practically constant during the period in 
question. Hence Dr Newsholme is correlating φ/P with 1/(φ/P), or a variate 
with its reciprocal. In other words we may anticipate something very closely 
approaching perfect correlation. The deviation from such correlation arises from 
the fact that p_i/P is not absolutely steady, although its variations are very probably 
nearly random. The assertion therefore that this fourth measure of segregation 
assists in demonstrating the close relation between the fall in the phthisis death- 
rate and institutional segregation is based on a fallacy which entirely overlooks 
“spurious correlation.” 
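
The argument can be made concrete with a short simulation of a variate correlated with (nearly) its own reciprocal. This is a sketch on invented figures: the death-rate values, the almost constant indoor-pauper rate, the seed and the sample size are all assumptions chosen for illustration, and `pearson` is a helper defined here.

```python
import random


def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(4)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical phthisis death-rate (phi/P) with about 10% variability.
death_rate = [rng.gauss(150, 15) for _ in range(n)]

# Hypothetical indoor-pauper rate (p_i/P): nearly constant, with small
# random fluctuations that are independent of the death-rate.
indoor_rate = [rng.gauss(50, 1) for _ in range(n)]

# Fourth segregation ratio p_i/phi = (p_i/P) / (phi/P).
fourth_ratio = [pi / d for pi, d in zip(indoor_rate, death_rate)]

r = pearson(death_rate, fourth_ratio)
print(round(r, 3))
```

The coefficient falls short of −1 only because the simulated indoor-pauper rate is not perfectly steady, which is the deviation the text describes.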


It will be seen therefore that not one of Dr Newsholme’s methods of reaching 
an approximate measure of the segregation is satisfactory, and they lead to con- 
tradictory and inconclusive results. Whether there is any really substantial 
relation between the prevalence of phthisis and institutional segregation we 
do not yet know. All we can say is that Dr Newsholme has entirely failed to 
demonstrate it, if it actually exists. 


(6) Before concluding this paper it may be of interest to judge how far it 
justifies the application of the method of variate difference correlation to such 
problems as are here dealt with. 


In the first place, the correlations of successive differences should approach 
steady values. This is generally (as the reader can judge by examining Tables I, 
II, IV and V) but not invariably the case. The test cannot, however, be 
completed, as the method ought not to be pressed to such high differences that the 
order of the difference is a large percentage of the original “population.” 

We doubt whether it is advisable to carry differences beyond the 8th in a 
population of 38. A reduction of 20% to 25% in the population is surely as much 
as it is safe to allow where the original population is so small in number. It is true 
that a population of 38 itself is capable of exciting the derision of trained 
statisticians, and ought never to be used where hard work can produce larger 
numbers. But in annual returns, as has been indicated by others, a period of 
30 to 50 years is often the maximum attainable, and we must take what we can 
get. In the present case the probable errors of the difference correlations, based 
on the Andersonian formulae for steady conditions, show us that we can form 
fairly legitimate conclusions from the results reached. 

* Under the circumstances it is, perhaps, unnecessary to draw attention to Dr Newsholme’s state- 
ment that “the specific result of pauper segregation must have been lower in Ireland than in England or 
Scotland” (p. 282). Free of the time-factor the correlations of phthisis deathrate and Dr Newsholme’s 
fourth segregation ratio are higher in Ireland than in England or Scotland. This criticism as well as 
Dr Newsholme’s original remark are of no importance, because the fourth segregation ratio correlation 
is entirely spurious. 


A second test that we have applied is the approach to the theoretical values 
in the function σ²_{δ_m x}/σ²_{δ_{m−1} x}, where δ_m x is the mth difference of the variate x. 


The following table shows that there is a reasonable approach to these 
theoretical values in the calculated standard deviations of the differences, and 
suffices to justify the application of the variate difference method within the 
limits of practical statistics. We have continued the differences beyond the 
values used in some of the correlation results to indicate the sort of irregularities 
which may be expected to occur when using high differences in small populations. 
Terminal irregularities then begin to affect the uniform rise of 
σ²_{δ_m x}/σ²_{δ_{m−1} x}. 
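
For a purely random (serially uncorrelated) series, the variance of the mth difference is the central binomial coefficient C(2m, m) times the variance of the series, so the ratio of successive difference variances is C(2m, m)/C(2m − 2, m − 1) = 4 − 2/m. The sketch below tabulates this ratio on the assumption that these are the theoretical values referred to above; the function names are illustrative.

```python
from math import comb


def variance_multiplier(m):
    """Variance of the m-th difference of an uncorrelated series,
    in units of the variance of the series itself."""
    return comb(2 * m, m)


def theoretical_ratio(m):
    """Ratio of the variance of the m-th difference to that of the (m-1)-th."""
    return variance_multiplier(m) / variance_multiplier(m - 1)


ratios = {m: theoretical_ratio(m) for m in range(1, 15)}

for m, value in ratios.items():
    # The binomial ratio and the closed form 4 - 2/m are printed side by side.
    print(m, round(value, 3), round(4 - 2 / m, 3))
```

The ratio rises steadily from 2 towards 4, which is the "uniform rise" against which the observed ratios can be compared.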








TABLE VI. Values of σ²_{δ_m x}/σ²_{δ_{m−1} x} and the approach to the theoretical values. 

[Table printed sideways in the original; its entries are not recoverable from this copy. The columns give, for England with Wales, Scotland and Ireland, the Phthisis Death-rate (10^5 φ/P), the Third Segregation Ratio (100p_i/p_t) and its reciprocal, and the Fourth Segregation Ratio (100p_i/φ) and its reciprocal (100φ/p_i), for successive series m, together with a column of theoretical values.] 


TABLE A. England with Wales, Scotland, Ireland. 

[Table printed sideways in the original; its entries are not recoverable from this copy. For each country and year the columns give the Number of Indoor Paupers, the Total Number of Paupers, the Population, and the Number of Deaths from Phthisis. A footnote records a population figure that was given differently in Annual Reports before 1908.] 





TABLE B. England with Wales, Scotland, Ireland. 

[Table printed sideways in the original; its entries are not recoverable from this copy. For each country and year the columns give rates and ratios formed from the numbers of indoor paupers, total paupers, deaths from phthisis, and population.] 





THE INFLUENCE OF ISOLATION ON THE 
DIPHTHERIA ATTACK- AND DEATH-RATES. 

By ETHEL M. ELDERTON, Galton Research Fellow 
AND KARL PEARSON, F.R.S. 

(1) Introductory. The problem of the advantages of isolation, not only in the 
case of diphtheria but of other diseases of an infectious character, is likely, owing 
to modern views as to “carriers” and other sources of transmission, to be much 
discussed in the near future. It is therefore well to consider what may be learnt 
from the statistics available. The questions which naturally arise are of the 
following kind: 

(i) In districts with a maximum of isolation is there a minimum of incidence? 

(ii) In districts with a maximum of isolation is there a minimum deathrate 
from the disease isolated? 

There cannot be the slightest doubt that, if these two questions were answered in 
the affirmative and we could show that the incidence was markedly less and the 
deathrate significantly smaller in districts where isolation was most stringently 
carried out, then these results would be advanced as a strong argument in favour 
of isolation. 

To the trained statistician, however, no conclusion based upon such results 
without much further analysis would have any validity. To illustrate this point, 
let us consider the hypothetical case that medical or popular opinion in a given 
town has been persistently in favour of increasing the isolation-rate, and further 
suppose that in this district improved economic conditions have increased the 
immunity, or bettered sanitation lowered the incidence, while at the same time 
new methods of treatment have lowered the deathrate of the disease; it will be 
clear that in considering the statistical results over a course of years we should 
find a high isolation-rate negatively correlated with both the incidence- and the 
death-rates. Thus if we considered this correlational as a causal nexus, we 
should be raising an apparently strong argument in favour of a maximum of 
isolation, which would be based on the statistical fallacy, that when two quantities 
are both changing continuously with the time, this must of itself denote a causal 
relation. 
















550 A Study of the Effects of Diphtheria Isolation 


In precisely the same way a positive correlation between the isolation rate and 
the attack- or death-rates by no means justifies us in asserting that isolation is 
worse than non-effective. It is conceivable that in the period or the district 
under consideration with an increasing isolation-rate there might be decreased 
immunity in the population, greater virulence of the disease, or even a limit 
to the available isolation accommodation, so that in the case of attacks of an 
epidemic nature the isolation rate would not increase proportionately to the cases, 
or indeed might even diminish*. Further, if apart from the changes in a single 
district, we consider a great variety of districts, it may chance that the greatest 
isolation-rate occurs in those districts where the disease has been found most 
prevalent, because it appeared the most obvious remedy, and thus a greater attack- 
or death-rate would be no real measure of the futility of high isolation. 


If, however, it should turn out that on the whole the higher isolation rate is 
associated with the higher attack-rate or the higher death-rate then it will be clear 
(i) that there is ground for demanding a closer investigation as to the advantages 
of isolation, and (ii) that we may be overlooking the real method, or at least one 
or more important factors, of the transmission of the disease. It is conceivable 
that isolation of all cases during attack may be of far less importance than isola- 
tion of certain special cases for a shorter or longer period well subsequent to the 
attack, and after they would normally have resumed their ordinary avocations†. 


The main problems which arise are accordingly these : 


(i) Have isolation-, attack- and death-rates changed continuously with the 
time, and are the apparent correlations really suggestive of causal relationships ? 


(ii) Are associations between isolation-rate and attack- and death-rates really 
spurious arising from the fact that where the attack- and death-rates have been 
severe there the remedy which appeared nearest to hand was more isolation ? 


* For example, if there were only 100 hospital beds available, and out of 100 cases 50 were sent to 
hospital, the isolation-rate would be 50%; but if in the next year there were 300 cases and all the beds 
were used, the isolation-rate would be only 33%. Thus limited accommodation may tend to produce 
a negative correlation between isolation-rate and attack-rate, so that a positive correlation between these 
two rates may be of more importance than its apparent significance. It is extremely probable that 
some of the falls in isolation-rates are really due to an increase of incidence, so that the same 
percentage of cases cannot be met by the available hospital accommodation. 

† It is, on the hypothesis of natural selection, a plausible view that the parasites (including under 
this term all disease organisms) which ultimately survive must tend to become innocuous to their 
hosts, and thus the decreasing virulence of certain diseases may be accounted for. The organism is 
destroyed owing to the death of the host or its own death at his recovery, or it has been modified by 
selection so as to become innocuous to its host relative to his immunity. But immunity is a matter of 
personal equation, and thus the function of the “carrier” in preserving and spreading a conceivably less 
nocuous form of the organism becomes clearer. We are not unaware of the view that the organism 
remains the same, but that the immunity is increased owing to “practise” of the leucocytes, but such a 
view requires the assumption of inheritance of acquired characters to explain reduced disease virulence, 
and further compels us to assume two types of immunity, the one which destroys the organism, and 
the other which without modifying it, establishes so to speak a mutual modus vivendi. 
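
The arithmetic of the hospital-bed footnote can be set out in a few lines. This is only an illustration of that example; the function name and its parameters are hypothetical, and the isolation-rate is taken to be the percentage of cases actually accommodated in hospital.

```python
def isolation_rate(cases, sent_to_hospital, beds):
    """Percentage of cases isolated when hospital accommodation is limited."""
    isolated = min(sent_to_hospital, beds)  # no more cases than there are beds
    return 100.0 * isolated / cases


# First year: 100 cases, 50 of them sent to hospital, 100 beds available.
first_year = isolation_rate(cases=100, sent_to_hospital=50, beds=100)

# Next year: 300 cases and every one of the 100 beds in use.
second_year = isolation_rate(cases=300, sent_to_hospital=300, beds=100)

print(first_year, round(second_year, 1))
```

The attack-rate trebles while the isolation-rate falls, which is how a ceiling on accommodation alone can produce a negative association between the two rates.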



ETHEL M. ELDERTON AND KARL PEARSON 551 


(iii) Are the districts which have adopted most isolation really urban 
districts where isolation was easiest to adopt and where possibly economic or 
social conditions favoured the spread of the disease or, in the case of the death- 
rate, the disease encountered a less resistant population ? 


(iv) What evidence is there to show that the districts which have rapidly 
increased their isolation-rates have subsequently lower attack- or death-rates ? 





If no one of these problems can be fully answered, even in the case of a single 
disease, with the data at present available, at least light can be thrown on the lines 
which their solution in the future must take; and further something can be done 
to prevent hasty generalisation and excessive dogmatism as to the advantages or 
disadvantages of the isolation system. It can never be too strongly insisted upon, 
because it is so often forgotten, that preventive medicine is essentially an 
experimental science, and that in nine cases out of ten the efficiency of any line 
of action can only be adequately tested by statistics and by statistics collected after 
the expenditure of many thousand pounds, possibly spread over a long period of 
years, in carrying out this line of action *. 


(2) Material. In endeavouring to throw some light on the above problems we 
have fortunately received data of very considerable value from Dr E. H. Snell, the 
Medical Officer of Health for the City of Coventry. He obtained for a period of 
nine years, 1904-1912 inclusive, for about eighty towns or districts of large popula- 
tion but of very varying local conditions, (i) the annual number of diphtheria cases, 
(ii) the number removed to hospital, (iii) the number of deaths. We have added 
to this material the estimated population of the town or district, and further 
certain data as to the economic and social conditions. Unfortunately there is no 
existing adequate measure of the general sanitary condition of individual towns, 
although the construction of a general sanitary “index number” would be of 
remarkable value in many forms of inquiry. We took as our measures of social 
condition : 


(a) Death-rate of infants under a year. 


(b) Amount of overcrowding, that is to say the percentage of the population 
in private families living more than two in a room. 


(c) Density of population, i.e. the number of persons to the acre. 


* Assert that it is most desirable to test the effect of sanatoria and of tuberculin in cases of 
tuberculosis, but do not dogmatically proclaim them as “ cures” for phthisis, until statistics have been 
collected in sufficient amount and have been adequately and dispassionately examined to prove or 
disprove your statements. Insist on compulsory inoculation for enteric in the case of all recruits, but 
do not make it optional and then publish letters in the newspapers giving perfectly idle statistics, or 
go round to the camps giving popular lantern lectures to the recruits showing the gravestones of 
uninoculated persons, the portraits of persons dying of enteric, or much enlarged pictures of bacilli! 
If you think it experimentally worth doing, inoculate ; but don’t bring inoculation about by emphasising 
the dread of pain or the fear of death, both of which it is the first essential for a soldier wholly to 
disregard. 






























552 A Study of the Effects of Diphtheria Isolation 


(d) Economic prosperity as measured by the number of indoor and outdoor 
servants of both sexes per 100 private families. 

Our data are based on the census of 1911 as providing more ample information 
on these points. It will we think be admitted that the list of towns dealt with 
provides a very fair sample of the urban populations of this country. It ranges 
from manufacturing towns* like Preston, Rochdale and Bolton, mining and iron 
towns like Rhondda, Wigan and Middlesbrough, sea-ports like Hull, Liverpool 
and Southampton, to county towns like York and Reading, watering places like 
Brighton and Blackpool, suburban districts like Acton and Hornsey, and residential 
towns like Oxford or Bath. We ought from such a list to be able to throw some 
light on the relation of isolation to incidence under a variety of social conditions, 
if indeed these latter are factors in the problem at all. 

(3) What are the crude correlations between Isolation-Rate, Attack-Rate and
Mortality-Rate? The isolation-rate (I) has been measured as the average per
cent. of cases removed to hospital during a five or four year period. We have two
such periods, the earlier period 1904-1908 and the later period 1909-1912. The
attack-rate has been measured per 1000 of the population, uncorrected for age
distribution. Since diphtheria is largely a disease of infancy and childhood this 
neglect of the age correction—the reduction to a standard population—may seem 
serious. But in the first place we had not the age incidence in the individual 
districts, and in the second place we satisfied ourselves that such correction, if it 
could have been made, would not substantially modify any argument we have 
based on our data. For we calculated the attack-rate (A’) on the population 
under 15 years of age, as well as the attack-rate (A) on the total population of the 
districts. We found the correlation between the two methods of measuring the 
attack-rate was +.972, which indicates how close is the relation between the two
methods of measuring the attack-rate and how little influence small variations 
in the proportion of less immune persons in the population due to age differences 
could have on the results. 

The attack-rate (A) has been measured as the number of cases per 1000 of the
population. The mortality-rate has been measured in two different ways; first as
the population mortality, the death-rate in the ordinary sense (M) or the deaths
per 1000 of the population; and secondly the case death-rate or the mortality (m)
per 100 attacked. We now give the crude correlations between I and A.
They are:

First Period: 1904-1908, $r_{IA} = +.427 \pm .063$,
Second Period: 1909-1912, $r_{IA} = +.290 \pm .069$.

* See table, p. 567, for 76 of the 80 towns, the four others with full data only for the second period 
being Reading, Stoke, Dewsbury and Edmonton. 

† The formula giving the juvenile attack-rate A′ in terms of the crude attack-rate A is:

$A' = 1.3094\,A + .0164$,

with a probable error of ±.1369.
Thirty-three towns were selected at random out of the 80, and gave the following results for A′
calculated from A and A′ as observed. The theoretical mean error = .162; the mean error of the defects

































ETHEL M. ELDERTON AND KARL PEARSON 553


Thus both periods show significant if not very large correlation*. The difference
(.137 ± .093) between the coefficients for the two periods is, however, probably
not significant. Thus in towns with greater isolation-rate there is certainly a
higher attack-rate, and equally certainly no argument can be based on the crude
figures to prove that the more the isolation the less prevalent is diphtheria. We
will now turn to the death-rate M, and we find:

First Period: 1904-1908, $r_{IM} = +.153 \pm .075$,
Second Period: 1909-1912, $r_{IM} = -.012 \pm .075$.
In the first period isolation was associated with a higher diphtheria death- 
rate, in the second period with a lower diphtheria death-rate, but neither are 
of any real significance. Thus all we can conclude from the crude figures is 
that they show no evidence that isolation has reduced the general death-rate from 
diphtheria. 
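The probable errors attached to the coefficients above can be checked with a short calculation. The sketch below assumes the usual large-sample formula of the period, P.E.(r) = .6745(1 - r²)/√n, which the paper does not state explicitly; it also assumes 76 towns for the first period and 80 for the second, and treats the two periods as independent when combining their probable errors. The function names are illustrative only.

```python
import math

def probable_error_r(r, n):
    """Probable error of a correlation r from n pairs (assumed formula)."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

def probable_error_of_difference(pe_1, pe_2):
    """Probable error of a difference, treating the two estimates as independent."""
    return math.hypot(pe_1, pe_2)

# Isolation-rate and attack-rate correlations; the town counts are assumptions.
pe_first = probable_error_r(0.427, 76)
pe_second = probable_error_r(0.290, 80)
pe_diff = probable_error_of_difference(pe_first, pe_second)

# A difference is usually called significant only when it is several
# times its probable error; here the ratio is well under 2.
ratio = (0.427 - 0.290) / pe_diff
print(f"{pe_first:.3f} {pe_second:.3f} {pe_diff:.3f} {ratio:.2f}")
```

On these assumptions the second-period probable error and the probable error of the difference come out close to the printed values, and the difference is less than twice its probable error, which agrees with the remark that it is probably not significant.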

We next take the case death-rate (m) and we have for the two periods:

First Period: 1904-1908, $r_{Im} = -.509 \pm .057$,
                         $r_{Am} = -.527 \pm .056$,

Second Period: 1909-1912, $r_{Im} = -.534 \pm .054$,
                          $r_{Am} = -.495 \pm .057$.





is -.153 and of the excesses +.134; this shows very fair accordance, 17 deviations being positive and
16 negative. The greatest deviations occur in Hornsey, Bath and Brighton, where residential neighbourhoods
show fewer children, and in Edmonton, Walthamstow and Rhondda where there are probably
excess of children. On the whole the general order is very well maintained, and the general attack-rate
closely fixes the juvenile attack-rate. In any further collection of material, it would of course be well
to have the age-distribution of cases.

Juvenile attack-rate A′ (entries illegible in the copy are marked …):

Town            Observed   Calculated   Δ
Derby           …          4.25         -.17
Southampton     2.67       2.70         +.03
Hornsey         2.48       1.86         -.62
Bristol         2.33       2.23         -.10
Reading         …          1.95         -.18
Nottingham      2.09       1.99         -.10
Salford         2.04       2.16         +.12
Ilford          1.98       2.09         +.11
Brighton        1.94       1.68         -.26
Stockton        1.90       2.12         +.22
Ipswich         1.87       1.89         +.02
Grimsby         …          1.90         +.05
Walthamstow     1.85       2.10         +.25
Coventry        1.84       1.93         +.09
Plymouth        …          1.56         -.05
Wakefield       1.40       1.39         -.01
Smethwick       1.38       1.55         +.17
Edmonton        1.36       1.80         +.44
Bath            …          …            -.31
Newport         …          …            +.11
Rhondda         1.10       1.34         +.24
Bury            1.08       …            -.17
Rotherham       1.06       1.18         +.12
Dewsbury        1.01       1.01         -.00
Blackburn       .99        .91          -.08
Manchester      .94        .97          +.03
Oxford          .89        .79          -.10
Bolton          …          .85          -.01
Rochdale        …          .75          -.07
Northampton     .78        .76          -.02
Barnsley        .75        .88          +.13
Wigan           .58        .64          +.06
W. Bromwich     .45        .53          +.08
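The regression formula of this footnote and its "theoretical mean error" can be reproduced in a few lines. The sketch below is only an illustration: it assumes a normal law of errors, so that the probable error is .6745σ and the mean (absolute) error is σ√(2/π), a relation the footnote does not spell out. The four towns are legible rows of the table, and the names used in the code are ours.

```python
import math

SLOPE, INTERCEPT = 1.3094, 0.0164   # A' = 1.3094 A + .0164, from the footnote
PROBABLE_ERROR = 0.1369             # probable error quoted with the formula

def juvenile_rate(crude_rate):
    """Juvenile attack-rate A' predicted from the crude attack-rate A."""
    return SLOPE * crude_rate + INTERCEPT

def mean_error_from_probable_error(pe):
    """Mean absolute error implied by a probable error (normal law assumed)."""
    sigma = pe / 0.6745
    return sigma * math.sqrt(2.0 / math.pi)

theoretical_mean_error = mean_error_from_probable_error(PROBABLE_ERROR)

# (observed A', calculated A') for four legible rows of the table.
rows = {
    "Southampton": (2.67, 2.70),
    "Brighton": (1.94, 1.68),
    "Walthamstow": (1.85, 2.10),
    "Rhondda": (1.10, 1.34),
}
deviations = {town: calc - obs for town, (obs, calc) in rows.items()}

print(round(theoretical_mean_error, 3))
for town, delta in deviations.items():
    print(f"{town:12s} {delta:+.2f}")
```

Under the normal-law assumption the theoretical mean error agrees with the .162 of the footnote, and the deviations are simply calculated minus observed, negative in the residential town and positive in the two districts with an excess of children.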

* We endeavoured to see whether the correlation of isolation- and attack-rates would be modified if
we took the attack-rate on children under 15 years. This made little difference, r being raised only
from +.290 ± .069 to +.315 ± .068.







According to these correlations, when or where the isolation-rate is high, the 
case mortality is low. Further when or where the attack-rate is high the case 
mortality is low. Now we know that: 


I = 100 × isolated cases ÷ all cases,
A = 1000 × all cases ÷ population,
m = 100 × deaths ÷ all cases.


Hence if we selected the number attacked at random and chose the deaths to
be simply some number less than this, we should expect to find a considerable
negative correlation between A and m; and as we actually do find such a correlation,
we cannot be certain that the actually observed values of $r_{Am}$ are not due
to “spurious correlation.” If they were “organic” we should interpret them to
mean that a widespread epidemic (A large) was a less virulent epidemic (m small).
On the other hand the spurious correlations of I and m would be positive in value,
while the actual correlations are negative. Thus it would seem that while a high
isolation-rate is associated with high attack-rate, it must be “organically” associated
with a lessened case mortality. In other words while isolation does not, on
the crude figures, appear to lessen the frequency of disease, it does appear to
lessen the mortality among the attacked. This result appeared to be of such
very great importance, if thoroughly established, that we determined to inquire
into it further. It seemed reasonable to believe that the bulk of persons
might have better care in a hospital than in their own homes and thus isolation
indirectly lessen the ill effects of the disease.
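The direction of the spurious correlations discussed above can be illustrated by a small simulation. The sketch below is hypothetical throughout: the towns, the ranges and the seed are invented, and cases, isolated cases, deaths and population are drawn independently of one another, so that any correlation between the rates arises solely from the shared term "all cases". Deaths and isolated cases are kept below the smallest number of cases, as the argument requires.

```python
import random

def pearson(xs, ys):
    """Product-moment correlation of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate(n_towns=5000, seed=1915):
    """Return (r_Am, r_Im) for rates built from independent random counts."""
    rng = random.Random(seed)
    isolation, attack, case_mortality = [], [], []
    for _ in range(n_towns):
        population = rng.uniform(50_000, 500_000)   # invented range
        cases = rng.uniform(200, 2000)              # invented range
        isolated = rng.uniform(0, 200)              # never exceeds cases
        deaths = rng.uniform(0, 200)                # never exceeds cases
        isolation.append(100 * isolated / cases)
        attack.append(1000 * cases / population)
        case_mortality.append(100 * deaths / cases)
    return pearson(attack, case_mortality), pearson(isolation, case_mortality)

r_Am, r_Im = simulate()
print(f"spurious r_Am = {r_Am:+.3f}, spurious r_Im = {r_Im:+.3f}")
```

With independent counts the attack-rate and case-mortality come out negatively correlated, while isolation-rate and case-mortality come out positively correlated. An observed negative correlation between I and m therefore runs against the spurious effect, which is the point of the argument in the text.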


We accordingly endeavoured to approach the problem from a somewhat
different standpoint: Given two districts with the same total number of persons
attacked (a), will that district with more isolated (i), have fewer or more deaths (d)?
The answer to this question depends on whether the partial correlation coefficient
of total isolated cases with total deaths for constant number attacked is negative
or positive. We found:


                                                         First Period     Second Period
Correlations                                             1904-1908        1909-1912
$r_{id}$ = Isolated Cases and Deaths                     +.860 ± .020     +.867 ± .019
$r_{ia}$ = Isolated Cases and Attacked                   +.937 ± .010     +.968 ± .005
$r_{ad}$ = Attacked and Deaths                           +.907 ± .014     +.918 ± .012
${}_a r_{id}$ = Isolated Cases and Deaths for
    constant number of Attacked                          +.066 ± .077     -.220 ± .072





Thus in the first period for a given number of attacked more isolation was
associated with more deaths, and in the second period for a given number
attacked, with fewer deaths; but in both periods, having regard to the probable
errors, we cannot assert any real significance, or be reasonably certain that where
there is more isolation, there recovery is more likely to occur.
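The partial coefficients in the table above can be recomputed from the three total correlations. The sketch below uses the standard first-order partial correlation formula; the function name is ours, and the inputs are the three-decimal values as printed, so small differences from the printed partial coefficients are to be expected.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y for constant z (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# x = isolated cases (i), y = deaths (d), z = number attacked (a).
first_period = partial_correlation(0.860, 0.937, 0.907)
second_period = partial_correlation(0.867, 0.968, 0.918)

print(f"1904-1908: {first_period:+.3f}")
print(f"1909-1912: {second_period:+.3f}")
```

Both results agree with the table to within a few units in the third decimal place. Because the total correlations are all near unity, the numerator is a small difference of large quantities, so the partial coefficient is sensitive even to rounding in the inputs.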


We shall see later that the correlation between I and m for constant total
number of attacks is not the same thing as the correlation of the total isolated and













total deaths for constant total number of attacks. And this divergence, often in a 
marked degree, of partial correlations for rates and for absolute numbers is not 
unfamiliar to those who have had to deal with disease statistics. In the present 
case it renders still more obscure any argument drawn in favour of isolation from 
apparently lesser case mortality. 




(4) On the degree to which “spurious correlation” may be influencing the 
attack- and death-rates. It seemed desirable if possible to throw further light on 
this point and accordingly we correlated attack- and death-rates with the total 
population. It will be remembered that: 

A = 1000 × cases ÷ population,
M = 1000 × deaths ÷ population,
and accordingly if A and M be correlated with the population P, we might 
anticipate that if cases and deaths had no relation to population, there would be 
a high negative correlation arising from A and M both varying inversely as P. 
We were comforted by finding practically insignificant positive correlations. Thus: 


                                                 First Period     Second Period
Correlations                                     1904-1908        1909-1912
$r_{PA}$ = Population and Attack-rate            +.137 ± .075     +.054 ± .075
$r_{PM}$ = Population and Death-rate             +.131 ± .076     +.116 ± .074
$r_{PI}$ = Population and Isolation-rate         +.152 ± .075     +.102 ± .075


The last correlation coefficients show us that there is very little relation between
the size of a population and the amount of isolation practised. Further these
isolation correlations in which there is no obvious source of spurious correlation
are as significant as those of population with attack- and death-rates where the
possibility of “spurious correlation” is manifest. We conclude accordingly that
risk is more uniformly distributed over population than we had anticipated, and
that the correlations between the three rates I, A and M are really open to
“organic” interpretation.

The next point which arises for discussion is whether the presence of the total 
number attacked (a) in the rates I and m can produce spurious correlation. If so
we should anticipate that the absolute number a would be negatively correlated 
with both isolation and case mortality rates. We found: 


                                                 First Period     Second Period
Correlations                                     1904-1908        1909-1912
$r_{aI}$ = Total attacked and Isolation-rate     +.264 ± .072     +.226 ± .072
$r_{am}$ = Total attacked and Case-Mortality     -.251 ± .072     -.203 ± .072


The first set of these coefficients are not even negative and therefore cannot 
be due to “spurious correlation,” although such correlation may have reduced 
their organic values. They admit, however, of an easy interpretation, namely 
that: where the number attacked has been large the isolation has been more 
practised. The second set of coefficients might be due to spurious correlation, but 
they again admit of a simple interpretation as apart from “spurious correlation,” 
namely that: when the attacks are numerous the deaths are relatively few, 




because a wide-spread epidemic means a mild epidemic. All four coefficients are
significant, and pair and pair they are quite consistent but in no case are they
of any marked importance. They enable us, however, to correlate the isolation-rate
and the case-mortality for a constant total attacked, i.e. to find the partial
correlation ${}_a r_{Im}$. We have the following results:


                                                         First Period     Second Period
                                                         1904-1908        1909-1912
${}_a r_{Im}$ = Correlation of Isolation-rate and
    Case mortality for a constant number attacked        -.474 ± .056     -.512 ± .057

while we have already found:

${}_a r_{id}$ = Correlation of number isolated with
    number of deaths for constant number attacked        +.066 ± .077     -.220 ± .072


Correlation of Total Numbers Isolated and Total Registered Deaths. 


Total Numbers Isolated. 








| ~ ~ ~ ~ ~ ~ 
ate be Ss £218 |Site leteista 
4 % t~ ™ ™ ™~ ™ N Y » Y SS 85 
| + | | t | | | Totals 
. s ~ ~ ~ R | » | » VN | H 
o— 751 64 27 2 1 ee ors SS | 94 
V5—150 16 9 3 5 2 —_— 1 — — -j;—|— 36 
| 150—225 | l — 2 2 2 1 |— -- 9 
225—300 | — —_ 1 | — 2 l — - | — — 4 
| 300—375 — — 1 2 l l l 2 oa 6 
| s75—450] — | — | — | — 2 _ i I 5 
450—525 -- — — — o— l Sik “| sabes ae 1 
§25—600 — — aE a l Aa) SO, CRE = eas Era 1 
600—675 — — — — xe ee oy ea or ee 0 
| 676—750 | — = es ree ay RARE ew 9 Crees Wee) BE mh, jee 0 
750—825 oo — — — — =e a l 1 
Totals 81 37 6 7 10 6 4|2 l 0 1.0 | 2167 
j | —— —_ — — — 
Means 54°2 | 59°8 | 112°5 | 133°9 | 232°5 | 312°5 | 382°5 at 2125 isolated 103-4 
| 








It will therefore be clear that removing the variation in number attacked has 
made only slight reductions in the values of the correlation coefficients between 
isolation-rate and case-mortality. The discrepancy between the absolute numbers’ 
and the rates’ correlations is not to be accounted for by “spurious correlation ” 
involved in the use of total numbers attacked in both rates. It must therefore be 
due to: (i) lack of linearity in certain of the regressions, (ii) high values in the 
coefficients of variation in certain of the quantities under discussion, or to a combination
of these causes. With the small size of the populations under discussion
it is by no means easy to test the true linearity of the regressions, even if we do 
what appears legitimate in this case, namely pool our data for the earlier and 







later periods. Our actual correlations have all been found without grouping by
the direct product-moment formula, but we give on pp. 556 and 558 two grouped
tables to illustrate the difficulties which arise in analysis. Our first table is for
the total numbers isolated and the total deaths registered. It will be seen at once
that the marginal distributions are intensely skew, crowding up into the corner of
few deaths and few cases isolated, so that they appear to asymptote to the zero
values of the coordinates. Further, Diagram I shows that the regression curve
of deaths on total number isolated is, if just sensibly, still not markedly skew.

[Diagram I: regression of total deaths (vertical axis) on total isolated (horizontal axis, 0 to 2500).]

Turning to the actual numbers given by this table we have the following series
of constants:
                                          Numbers Isolated (i)              Registered Deaths (d)
Mean                                      $\bar{i}$ = 475.33                $\bar{d}$ = 103.42
Standard Deviation                        $\sigma_i$ = 571.25               $\sigma_d$ = 118.61
Coefficient of Variation (= S.D./Mean)    $v_i$ = 1.20                      $v_d$ = 1.15
Correlation Coefficient and Ratio         $r_{id}$ = .8348* ± .0163         $\eta_{di}$ = .8564†

* Agrees reasonably well with the non-grouped values for the two separate periods.
† Found by taking means of all 13 column-arrays.
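The coefficients of variation in the constants above follow directly from the means and standard deviations. The short sketch below recomputes them, reading the mean of registered deaths as 103.42, and also prints their squares, since the size of such squares relative to unity decides whether they can be neglected in approximate formulae. The dictionary keys are ours.

```python
def coefficient_of_variation(mean, standard_deviation):
    """Coefficient of variation, defined as S.D. divided by the mean."""
    return standard_deviation / mean

# (mean, standard deviation) as given in the table of constants.
constants = {
    "numbers isolated (i)": (475.33, 571.25),
    "registered deaths (d)": (103.42, 118.61),
}

variation = {
    name: coefficient_of_variation(mean, sd)
    for name, (mean, sd) in constants.items()
}

for name, v in variation.items():
    print(f"{name}: v = {v:.2f}, v squared = {v * v:.2f}")
```

Both coefficients exceed unity, so their squares are larger still. This is far from the small values usual in anthropometric measurements.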

























Clearly these results are of much interest; they show that the difference of η
for deaths on isolation over r is not as great numerically as, perhaps, the graph
suggests, but they indicate the markedly high values for the coefficients of
variation. Now it is quite straightforward algebra to prove that

${}_a r_{i/a,\,d/a} = {}_a r_{i,\,d}$,

provided we may neglect terms of the square and product order in $v_i$ and $v_d$ compared
with unity, and this is perfectly legitimate when these coefficients of
variation are, as is usual in anthropometric measurements, quite small quantities.
But in the present case these quantities are greater than unity and their squares
are not negligible as compared with unity, thus we need not be surprised at the
marked inequality of ${}_a r_{Im}$* and ${}_a r_{id}$† found above. The values of the former
show a marked relation between the case mortality and the isolation-rate, and the
values of the latter indicate no appreciable betterment in the deaths due to
increased isolation. Before we consider which of these coefficients gives us in the
present case the better result as a guide to practical conduct, let us examine the
correlation table for isolation-rate and case-mortality for the same 157 observations.


Correlation of Isolation-Rate I and Case-Mortality m. 
Isolation- Rate. 














0—10 | 10—20 | 20—30 30—40 , 40—50 | 50—60 | 60—70 | 7O—80 | 80—90 Totals 
| | | le 
| } 

Bes © l - |/—] = l ae at 5 8 28 
be eer Se F 2 — 3 3 4 8 11 9 5 45 
= 12-16 5 4 wi y 8 3 8 6 ¢ 1 44 
~ | 16—20 2 — 2 1 g 3 3 2 16 
= 20—24 9 — 2 1 2 a 2 1 — 17 
= | 24—28 3 — 1 | — on — - 6 
2 28-39 l — | — | — | — — — —_ 1 
(<~] t ' 
oe Totals 23 OI; 4 ll 1 | 12 | 26 38 | 2 16 157 

| | as 
! | } | | - € > 
Means }19-04| 14:00 | 16°18 | 15°60 | 14°00 | 11°08 | 11°71 11°09 9°25 13°26 








The following constants were found for this table:

                                     Isolation-Rate                   Case Mortality
Mean                                 $\bar{I}$ = …                    $\bar{m}$ = 13.26
Standard Deviation                   $\sigma_I$ = 25.52               $\sigma_m$ = 5.58
Coefficient of Variation             $v_I$ = 0.52                     $v_m$ = 0.42
Correlation Coefficient and Ratio    $r_{Im}$ = -.5291 ± .038         $\eta_{mI}$ = .5546


The graph of the regression of case mortality on isolation-rate shows small
evidence of skew regression (see Diagram II), and this is again confirmed by the
difference between $r_{Im}$ and $\eta_{mI}$ being fairly small. The marginal frequency
distributions show, however, considerable skewness, and that for the isolation-rate is
lumped up at the end where there is no isolation: more than half the numerator
of $r_{Im}$ being contributed by the towns with little or no isolation. It is desirable
to consider these towns further. They have an attack-rate of .76, which is sensibly
* This is the ${}_a r_{Im}$ of our p. 556. † The values are given on our p. 554.
















less than the mean attack-rate (1.30), but they have a case-mortality of 19.04 as
against the average case-mortality of 13.26; the 17 towns* with no isolation at
all give a case-mortality of 19.4. It would thus appear that the towns with little
or no isolation are those with a lower average attack-rate, but with rare exceptions
their case-mortality is high.

[Diagram II: regression of case-mortality (vertical axis) on isolation-rate (horizontal axis, 0 to 100).]


To test the influence of these towns with little or no isolation, we have
removed the column 0-10 isolation-rate group and recalculated $r_{Im}$ and $\eta_{mI}$; we
find

$r_{Im} = -.4120 \pm .0484$, $\eta_{mI} = .4810$.


Thus while the correlations are somewhat reduced by excluding the towns with 
little or no isolation there is still in the towns which do isolate a very sensible 
relation between the degree of isolation and the case-mortality, and this relation 
exhibits rather more skewness. 


We may sum up as follows: The relation between greater isolation and a 
lessened case-mortality appears to be a real one. We have shown that it is hardly 
due to spurious correlation, as this would have produced a positive correlation and 
further no great changes are made when we correct for inequality in the numbers 


* South Shields (1st and 2nd Periods), Sunderland (1st and 2nd Periods), Barrow (1st Period),
Preston (1st Period), Wigan (1st Period), Smethwick (1st and 2nd Periods), Walsall (1st and 2nd
Periods), West Bromwich (1st and 2nd Periods), Coventry (1st and 2nd Periods), Barnsley (1st and 2nd
Periods). Of these towns West Bromwich in the 1st period had the highest case-mortality recorded of
any of our 80 towns, while Smethwick in both periods, and Coventry and Barnsley in the 2nd period
with no isolation had case-mortalities below the general average.













attacked. The regression is roughly linear and only very partially due to the high 
case-mortality in towns with no isolation. It is probable that where there is 
a large amount of isolation, the care of patients falls largely into the hands of 
a few men with a more extensive experience of the disease, and that this reduces 
the case-mortality. 


Against this may be set the fact that the correlations between the absolute 
numbers of deaths and of cases isolated for constant numbers attacked are in- 
significant. The divergence between the two methods of approaching the problem 
is, however, explicable because the coefficients of variation of the absolute numbers 
are greater than unity, and the identity of the correlations reached by the two 
methods depends on the neglect of the squares and products of the coefficients 
of variation compared to unity. It may be asked: Why in this case we prefer 
the partial correlation found from the rates to that found from the absolute 
numbers? We reply: Because the partial correlation coefficient for the absolute 
numbers depends on very high total correlations, and if these correlations be, as 
we have shown, non-linear, then the partial correlation coefficient not only loses 
its full meaning, but may, as experience has shown us, easily change its sign as 
well as its magnitude. We would suggest that in a minor sense total mortalities
and total isolations are bound to give “restricted tables,” for deaths and isolated
cases are perforce less than the numbers attacked, and that in such “restricted”
tables, there is a general tendency to skew correlation and to a spurious factor*.
On the other hand it is true that case-mortalities and isolation-rates cannot
exceed 100% or fall short of 0%, but these limits are the same for every array
and do not vary from array to array as in the previous case. On the whole we 
think it safe to say that isolation is associated with greater prevalence of the 
disease and with a lessened case-mortality. 

(5) Is there any significant Relation between Isolation-Rate and General
Diphtheria Death-Rate? We have seen (p. 553) that insignificant correlations exist
between I and M, and it is difficult to understand how a spurious factor could
have modified this result. In the first place the small values of $r_{PI}$ and $r_{PM}$ on
p. 555 show us that the value of ${}_P r_{IM}$ is sensibly the same as $r_{IM}$; thus, for a constant
population there is no sensible association between diphtheria mortality and
isolation. But now let us ask whether for a constant attack-rate, isolation does
not lessen general diphtheria mortality. We have:


Correlation                                              First Period     Second Period
$r_{IM}$ = Isolation-rate and Death-rate                 +.1532           -.0119
$r_{IA}$ = Isolation-rate and Attack-rate                +.4268           +.2905
$r_{MA}$ = Death-rate and Attack-rate                    +.6772           +.6879

Hence
${}_A r_{IM}$ = Isolation-rate and Death-rate for
    constant Attack-rate                                 -.204 ± .074     -.305 ± .068


* See especially the illustrations of such ‘‘ restricted” tables and their regression lines in a paper 
by Waite on Finger-Prints: Biometrika, Vol. x, pp. 421—478. 
















Both of these values may be considered significant and negative, and hence 
when the attack-rate is constant there is a sensible, if not very close relationship 
between increased isolation and reduced general mortality from diphtheria. 
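As a check on the partial coefficients for constant attack-rate, the usual first-order formula may be written out with the second-period total correlations given above. The working below is a sketch added for illustration; the symbols are those of the table, while the arrangement and the intermediate rounding are ours.

```latex
{}_{A}r_{IM}
  = \frac{r_{IM} - r_{IA}\, r_{MA}}
         {\sqrt{\left(1 - r_{IA}^{2}\right)\left(1 - r_{MA}^{2}\right)}}
  = \frac{-.0119 - (.2905)(.6879)}
         {\sqrt{\left(1 - .2905^{2}\right)\left(1 - .6879^{2}\right)}}
  \approx \frac{-.2117}{.6945}
  \approx -.305 .
```

The negative sign arises almost wholly from the product $r_{IA}\, r_{MA}$: the crude correlation of isolation with death-rate is nearly zero, but isolation goes with higher attack-rate and higher attack-rate with higher death-rate, so that for equal attack-rates more isolation goes with fewer deaths.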

This confirms the view already reached that while isolation is associated with 
higher attack-rate its effect is to lessen the number of deaths whether they be 
reckoned as case-mortality or general population death-rate. 


(6) What is the meaning of the Association between Isolation and increasing 
prevalency of Diphtheria? The analysis of this problem is more complicated. 
The obvious answer of those who advocate increasing isolation would be that 
it has been adopted in those districts where the disease is most prevalent, and 
this of course may turn out to be correct. But we may ask in turn upon what 
statistics they depend to demonstrate their view that isolation lessens the preval- 
ence of the disease and is therefore advantageous, if our data demonstrate that 
where there is more isolation, there there is more diphtheria? It can only be by 
an analysis of no simple character that it is possible to deduce from such data 
that the practice of isolation has lessened the amount of the disease. 

There is, however, a preliminary problem to be dealt with. The isolation-rate
has been increasing very sensibly from 1904 to 1912, the attack-rate has lessened 
although very slightly, the case-mortality has lessened and the mortality on the 
population is considerably less. These facts are exhibited in the following table: 


                                                    Means                       Standard Deviations
Variate                               Symbol    1904-1908   1909-1912       1904-1908   1909-1912
Attack-rate per 1000 population       A         1.33        1.28            .657        .639
Isolation-rate per 100 attacked       I         42.4        55.7            25.52       25.18
Mortality per 1000 population         M         .174        .138            .080        .061
Mortality per 100 attacked            m         14.6        12.1            5.72        5.01


Now it may well be, since the attack-rate has changed so little, that in the
towns with increasing attack-rate there has been increasing isolation, both
quantities changing with the time, but having no causal relation the one to the
other. It is of some interest therefore to consider the type of districts in which
isolation is most practised. In the first place we ask if any known bad social
conditions are associated with prevalence of diphtheria. We took as our measure
of sanitary conditions (i) the infant death-rate, or the deaths of children under one
year per 1000 births, (ii) overcrowding, or the percentage of the population in
private families with more than two in a room. We found the following results:


                                               First Period     Second Period
Variates Correlated                            1904-1908        1909-1912
Attack-rate and Infant Death-rate              -.206 ± .074     -.206 ± .072
Attack-rate and Overcrowding                   -.153 ± .075     -.136 ± .074
















These are not very considerable, but they are consistent, and indicate, as 
far as they go, that the incidence of diphtheria is not dependent upon such 
measures as the above of unfavourable sanitary conditions. 


If we now turn to the correlation between the mortality-rate on the population 
and these measures of unfavourable sanitary conditions we find: 


                                                         First Period     Second Period
Variates Correlated                                      1904-1908        1909-1912
Death-rate from Diphtheria and Infant Death-rate         +.081 ± .076     +.118 ± .074
Death-rate from Diphtheria and Overcrowding              +.061 ± .079     +.004 ± .075


All these are indeed positive, but they are of no significance and if they were 
significant would be so small as to be of no importance. The first indeed might 
have been anticipated to show a higher value, for a certain number of deaths from 
diphtheria must be deaths of infants. We can only conclude that as far as these 
measures of unsanitary conditions are concerned they do not in any way determine 
the diphtheria death-rate. 


We now turn to the isolation-rate and find:

                                               First Period     Second Period
Variates Correlated                            1904-1908        1909-1912
Isolation-rate and Infant Death-rate           -.414 ± .064     -.375 ± .065
Isolation-rate and Overcrowding                -.236 ± .073     -.235 ± .071


These are significant although not very large and we conclude that most
isolation is practised in those districts which have the lowest infant death-rate and
the least overcrowding; the correlations are sensible if not very large. In other
words the towns with better health conditions have adopted more extensively the 
practice of isolating diphtheria cases. 


It seemed further of interest to determine: (i) whether diphtheria and isolation 
were more or less associated with urban conditions, and we took for this purpose 
the number of persons per acre, and (ii) whether the well-to-do character of the 
district, as measured by the number of domestic servants, indoor and outdoor, 
male and female per 100 private families, has any influence on the incidence of
mortality from, or the isolation of diphtheria. We found: 


                                               First Period     Second Period
Variates Correlated                            1904-1908        1909-1912
Persons per Acre and Attack-rate               +.165 ± .075     +.043 ± .075
Persons per Acre and Death-rate                +.169 ± .075     +.115 ± .074
Persons per Acre and Isolation-rate            +.073 ± .076     +.053 ± .075


Not one of these correlations is of any importance, if indeed any of them can be 
considered significant. It is thus clear that the intensity of urban conditions has 
very little to do with the prevalence of diphtheria, for if anything the suburban
conditions have the lesser death-rate; clearly isolation has no sensible relation 
to number of persons per acre. 
















Turning now to our measure of the prosperity of the district, we find that it 
has no influence on the attack-rate, that it sensibly, but not very intensely affects 
the mortality, the higher death-rate occurring in the poorer districts, and that 
isolation is associated quite significantly with the prosperity of the district, i.e. the 
more well-to-do the district the more isolation is practised *. 


                                                         First Period     Second Period
Variates Correlated                                      1904-1908        1909-1912
Number of Domestic Servants and Attack-rate              +.095 ± .076     +.024 ± .075
Number of Domestic Servants and Death-rate               -.219 ± .073     -.308 ± .068
Number of Domestic Servants and Isolation-rate           +.437 ± .062     +.363 ± .065


We conclude therefore that the more prosperous and generally healthier 
districts are associated with fuller isolation, and that the more prosperous, but 
not necessarily the more healthy districts, have the less diphtheria death-rate. 
On the other hand the incidence of the disease seems independent of the prosperity 
or density of population of the district and to be somewhat greater in those towns 
where the sanitary conditions as judged by infant death-rate and overcrowding are 
better†.


Thus as far as our measures go, we must conclude that diphtheria is not to be 
considered as a disease of markedly urban districts, of overcrowded or of insanitary 
districts‡. It would appear that the more prosperous and healthy districts have
the greater isolation and that these are subject to somewhat the greater incidence. 


* Of course this may largely mean that the more prosperous towns introduce isolation to remove 
the supposed danger of infection when servants of the families of the well-to-do are attacked. 

† In order to ascertain whether the variates persons per acre (p) and overcrowding (O) were merely
measures of the size of the town population (P) we correlated P with p and with O and found:

$r_{Pp} = +.404 \pm .064$ (1904-8), $= +.402 \pm .063$ (1909-12),
$r_{PO} = +.091 \pm .076$ (1904-8), $= +.074 \pm .075$ (1909-12).
Thus overcrowding has no relation to the size of the town, the larger towns do not show more over- 
crowding. There is, however, a considerable association of persons per acre with total population, 
the larger towns having more persons per acre without exhibiting any more significant overcrowding. 
Making the population constant we find: 


First Period 1904-1908 | Second Period 1909-1912

Total Correlation | Partial Correlation | Total Correlation | Partial Correlation
$r_{Ap} = +.165 \pm .075$ | ${}_{P}r_{Ap} = +.122 \pm .076$ | $r_{Ap} = +.043 \pm .075$ | ${}_{P}r_{Ap} = +.023 \pm .075$
$r_{AO} = -.153 \pm .075$ | ${}_{P}r_{AO} = -.167 \pm .075$ | $r_{AO} = -.136 \pm .074$ | ${}_{P}r_{AO} = -.140 \pm .074$


Thus correcting for population only makes the relation between persons per acre and incidence still 
more insignificant, while the relation between incidence and less overcrowding becomes slightly greater, 
without rising to any real importance. 
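
The partial correlations of this note follow the usual first-order formula, and the arithmetic can be sketched as below. This is an illustration only: the function name `partial_corr` is ours, and the correlation of attack-rate with total population, $r_{AP}$, is not quoted in the text, so the value 0.135 used for it is an assumption chosen merely to show the 1904-1908 calculation.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Total correlations for 1904-1908 as quoted in this footnote.
r_Ap = 0.165    # attack-rate with persons per acre
r_AO = -0.153   # attack-rate with overcrowding
r_Pp = 0.404    # population with persons per acre
r_PO = 0.091    # population with overcrowding

# ASSUMPTION: r_AP (attack-rate with population) is not given in the text;
# 0.135 is a hypothetical value used only to illustrate the formula.
r_AP = 0.135

partial_Ap = partial_corr(r_Ap, r_AP, r_Pp)
partial_AO = partial_corr(r_AO, r_AP, r_PO)
print(f"partial r_Ap = {partial_Ap:+.3f}, partial r_AO = {partial_AO:+.3f}")
```

With that assumed $r_{AP}$ the two values come out close to the tabulated partial correlations for the first period.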

‡ This result must not offhand be extended to subdistricts of our towns; it is an inter-urban and not
intra-urban statement. 


















It will be seen at once that this conclusion opens up new problems: (i) Is the 
greater isolation the outcome of greater incidence, the only remedy suggestable 
for greater incidence being a more complete isolation? (ii) Is the greater 
incidence in some manner a result of the greater isolation, and does it really tell 
against isolation as a remedy against the spread of diphtheria? The association 
of greater isolation with local prosperity would then be merely a measure of the 
economic capacity of the district for carrying out the accepted sanitary code. 
(iii) If (ii) is to be answered in the negative, then is there any factor in 
prosperity which makes for a greater diphtheritic incidence? The final answers 
to these problems can probably not be given on the basis of the present data. 
The correlations under discussion although significant are not of such a marked 
character as to provide more than provisional statements, or indeed more than 
suggestions for further inquiry and tabulation. 


(7) Does greater Isolation follow increasing Incidence, or greater Incidence 
follow increasing Isolation? The problem is a much more subtle one than appears 
at first sight. What we have established is that those towns with the higher 
isolation-rate have the higher attack-rate. It does not follow from this that the 
individual town which increases its isolation-rate will increase its attack-rate. 
To determine whether this is so we took as our variates: increase in isolation- 
rate (I) between the periods 1904-8 and 1909-12*, and the similar increase A of 
the attack-rate. We found: 

$r_{IA} = +.256 \pm .072$,
a value probably significant, although not quite so large as that found for the
inter-urban relation:
$r_{AI} = +.427 \pm .063$ (for 1904-8),
$= +.290 \pm .069$ (for 1909-12).


We can, we think, therefore conclude that the towns which increase their isolation-
rate are those with increasing attack-rate, just as the towns with higher isolation- 
rate are those with higher attack-rate. 


But this does not answer the question as to which is “the cart” and which 
“the horse”! Does the increased attack-rate precede or follow the increased 
isolation-rate? To answer this question we divided our material into three 
periods each of three years, let us say $T_1$, $T_2$ and $T_3$. Then the attack-rate
increase between $T_1$ and $T_2$ was correlated with the isolation-rate in $T_3$ and the
isolation-rate increase between $T_1$ and $T_2$ with the attack-rate in $T_3$. In other
words we asked whether towns with most rapid increase of attack-rate in the 


* That is the total number treated in hospital x 100 and divided by the total number of attacks was 
taken for the first period and for the second period, and their difference (second period—first period) 
was treated as increase in isolation-rate. In the same way the sum of the totals attacked for the years 
of the first period x 1000 and divided by the sum of the calculated intercensal populations for the same 
years was treated as the attack-rate, and the difference of second and first period values taken as the 
increase in the attack-rate. 



















early periods had most isolation in the later period, and whether the towns with
most rapid increase of isolation-rate in the earlier periods had most incidence in 
the later period. We found: 

$r_{A_{1-2} I_3} = -.004 \pm .077$,

$r_{I_{1-2} A_3} = +.085 \pm .077$.


Thus there is no significant relation whatever between either increase of attack- 
rate or increase of isolation-rate in the first periods, and the isolation or the 
incidence in the following period. 

As criticism of this result it might, perhaps, be suggested that the correlation 
of $A_{1-2}$ and $I_3$ will be influenced by what has been the course of I in the periods
$T_1$ and $T_2$ and the nature of A in $T_3$; we have accordingly, in order to test this,
made the isolation-rate constant in the first two periods and the attack-rate 
constant in the third period and find 


${}_{I_1 I_2 A_3}r_{A_{1-2} I_3} = +.147 \pm .076$.


This is still of no real significance, although the sign appears to indicate that 
where the isolation-rate has been constant then increasing attack-rate in the 
earlier period is followed by very slightly more isolation in the third period, even 
if the attack-rate in that period be itself constant. 

Similarly we determined : 

Ault hb 4,7 O77 +077. 


This coefficient shows that towns which have increased their isolation-rate during 
a period of constant attack are not liable to sensibly heavier attack in the 
following period. 

It would thus seem that our first two problems are both to be answered in the 
negative. Towns which increase their isolation are not those which in the fol- 
lowing period have most incidence, nor are those which have increasing incidence 
markedly those of most isolation in the following period. Attack and isolation 
appear to have no causative relation, and the association we have found between 
more isolation and more incidence seems to be contemporaneous rather than 
successional. We are, it seems, compelled to search for something in the
environment, which favours incidence and at the same time isolation. The only common
factor that we have been able to reach at present is the prosperity and general
healthy condition of the town. Under these circumstances there appears to be 
economic possibility of greater isolation, but why should there be greater incidence ? 
Is it possible that in the more prosperous towns there is greater consumption of 
some easily contaminated commodities, which may act as carriers of the disease, 
or more concourse of those of susceptible ages at places of public amusement or 
instruction ? 


(8) Test of the “organic” nature of the correlation of Isolation- and Attack- 
rates by the method of Variate Differences. If the suggestion made at the end 
of the last section be correct we should anticipate that by the use of the method 










of variate differences we should free ourselves from the influence of the time 
factor, if attack-rate and isolation-rate increase simultaneously in the more 
prosperous towns, but without organic association. We have nine years’ returns, 
but the epidemic nature of diphtheria in many cases does not give one great 
confidence in applying the method of variate differences to individual years. 
We considered that it would not be wise to deal with smaller intervals than 
three-year periods, and should have preferred had the data been available to work 
with five-year intervals. As it is, we cannot with three-year intervals for each
town go beyond the second differences. We have accordingly 228 isolation-rates 
and attack-rates obtained from 76 towns for each of three three-year periods, 
152 first differences, and 76 second differences. We may symbolise them as $I'$ and
$A'$, $\delta_1 I'$ and $\delta_1 A'$, and $\delta_2 I'$ and $\delta_2 A'$. We found the following results:


$r_{I'A'} = +.332 \pm .040$,

$r_{\delta_1 I' \delta_1 A'} = +.236 \pm .052$,

$r_{\delta_2 I' \delta_2 A'} = +.159 \pm .075$.
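
As a check on these figures, the numbers of cases and the probable errors can be reproduced in a few lines. The sketch assumes the customary formula of the period, .67449(1 - r²)/√n, for the probable error of a correlation coefficient; the text does not state the formula used, and the function names are illustrative only.

```python
import math

def differences(values):
    """Successive differences of one town's values for consecutive periods."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

def probable_error_r(r, n):
    """Probable error of a correlation r from n cases (ASSUMED formula)."""
    return 0.67449 * (1 - r ** 2) / math.sqrt(n)

TOWNS, PERIODS = 76, 3
placeholder = [0] * PERIODS                     # one value per three-year period
n_raw = TOWNS * PERIODS                         # rates themselves
n_first = TOWNS * len(differences(placeholder))               # first differences
n_second = TOWNS * len(differences(differences(placeholder))) # second differences

# (correlation, number of cases, probable error as printed in the text)
reported = [(0.332, n_raw, 0.040), (0.236, n_first, 0.052), (0.159, n_second, 0.075)]
for r, n, printed in reported:
    computed = probable_error_r(r, n)
    print(f"r = {r:+.3f}  n = {n:3d}  computed P.E. = {computed:.3f}  printed = {printed:.3f}")
```

The case counts are 228, 152 and 76, as in the text, and under the assumed formula the computed probable errors agree with the printed ones to three decimals.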


The first of these results compares reasonably with the previous results for the 
first and second periods on p. 552, i.e. 


1904-1908: $r_{IA} = +.427 \pm .063$,
1909-1912: $r_{IA} = +.290 \pm .069$,

with a mean value of +.358. And this is the more true because the values of
$r_{IA}$ were found by the product moment method without grouping, while $r_{I'A'}$ was
obtained from grouping in a correlation table*.


Now the above values bring out very markedly that when we endeavour to 
remove the influence of the time factor and to obtain a purely organic relationship 
between J and A, we more than halve the correlation between them by proceeding 
to the second difference only. If we might suppose that a hyperbola would give
the asymptotic value of $r_{\delta_s I' \delta_s A'}$ from the above three known correlations we should
have

$$r_{\delta_s I' \delta_s A'} = -.542 + \frac{7.084}{8.105 + s},$$

which indicates, although no stress can be laid on actual numbers, that at about
the fifth difference $r_{\delta_s I' \delta_s A'}$ would tend to become negative. All we think it
possible to say would be that if the time factor be eliminated there is very little
positive organic association between high isolation-rate and high attack-rate to be
cleared up, certainly not more than is indicated by the correlation on p. 565:


${}_{I_1 I_2 A_3}r_{A_{1-2} I_3} = +.147 \pm .076$,


* It may be noted that $r_{\delta_1 I' \delta_1 A'}$ was also found from a correlation table, but $r_{\delta_2 I' \delta_2 A'}$, as having only
76 cases, by product moment without grouping.




























which seems to suggest that other things being constant increasing incidence is to
some slight extent followed (probably as the only suggested remedy) by the
higher isolation rates*.
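
The hyperbolic extrapolation used in this section can be retraced numerically. The sketch below assumes a three-constant hyperbola of the form r(s) = a + b/(c + s) passed exactly through the correlations for s = 0, 1 and 2; that functional form and the names `fit_hyperbola` and `zero_at` are our own reading of the passage, not a statement of the authors' actual working.

```python
def fit_hyperbola(r0, r1, r2):
    """Fit r(s) = a + b / (c + s) exactly through the points s = 0, 1, 2."""
    ratio = (r0 - r1) / (r1 - r2)    # equals (c + 2) / c for this curve
    c = 2.0 / (ratio - 1.0)
    b = (r0 - r1) * c * (c + 1.0)    # because r0 - r1 = b / (c (c + 1))
    a = r0 - b / c                   # limiting value as s grows large
    return a, b, c

# Correlations of the rates, their first and their second differences.
a, b, c = fit_hyperbola(0.332, 0.236, 0.159)

# Order of difference s at which the fitted curve would cross zero.
zero_at = -b / a - c
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}, zero near s = {zero_at:.2f}")
```

Under these assumptions the fitted curve changes sign close to the fifth difference, which is the point made in the text; as the authors say, no stress can be laid on the actual numbers.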


(9) Can any other Factors be determined which measure the Relation between 
urban conditions and the Incidence of Diphtheria? It is worth while from this 
standpoint to place the towns with which we have dealt in the order of incidence, 
each town being credited with the mean of the three attack-rates for each of 
three-year periods. Now an examination of the four columns of this table shows 
that, with the exception of Oxford (which has a child incidence, .89 as compared
to .70, considerably above the population incidence owing to relatively
few children), the towns with the least diphtheria are the Midland, and
particularly the Northern manufacturing towns. These constitute practically the
whole of the first column of 19 towns. The last column contains the big ports 
and certain suburban metropolitan districts, indeed all these for which we have 
data except Plymouth, Devonport and Tottenham fall into the second half of the 


Seventy-six Towns in order of their Diphtheria Incidence Rates 1904-1912.











 1 West Bromwich (.40)   | 20 Rotherham (.91)      | 39 Birkenhead (1.20)       | 58 Brighton
 2 Northampton (.45)     | 21 South Shields (.92)  | 40 Rhondda (1.24)          | 59 Stockton
 3 Wigan (.48)           | 22 Preston (.93)        | 41 Smethwick (1.25)        | 60 Grimsby
 4 Walsall (.49)         | 23 Wallasey (.94)       | 42 Barrow (1.25)           | 61 Leyton
 5 Stockport (.53)       | 24 Bath (.95)           | 43 Newport (1.25)          | 62 West Ham
 6 Oldham (.59)          | 25 Bootle (.96)         | 44 Wimbledon (1.30)        | 63 Salford
 7 Bolton (.59)          | 26 York (.99)           | 45 Great Yarmouth (1.31)   | 64 Nottingham
 8 Oxford (.70)          | 27 Blackpool (.99)      | 46 Southend-on-Sea (1.32)  | 65 St Helens
 9 Barnsley (.71)        | 28 Tynemouth (1.00)     | 47 Birmingham (1.32)       | 66 Walthamstow
10 Southport (.72)       | 29 Tottenham (1.02)     | 48 Gillingham (1.34)       | 67 Ilford
11 Rochdale (.73)        | 30 Halifax (1.03)       | 49 Ipswich (1.36)          | 68 Southampton
12 Leicester (.76)       | 31 Sheffield (1.03)     | 50 Liverpool (1.37)        | 69 Cardiff
13 Manchester (.79)      | 32 Plymouth (1.07)      | 51 Hornsey (1.39)          | 70 Enfield
14 Bury (.80)            | 33 Coventry (1.11)      | 52 Darlington (1.39)       | 71 Hull
15 Blackburn (.83)       | 34 Warrington (1.11)    | 53 Acton (1.43)            | 72 Bristol
16 Wolverhampton (.86)   | 35 Devonport (1.13)     | 54 Newcastle (1.45)        | 73 Croydon
17 Burnley (.86)         | 36 Sunderland (1.14)    | 55 Burton on Trent (1.48)  | 74 Portsmouth
18 Huddersfield (.89)    | 37 Bournemouth (1.17)   | 56 Bradford (1.56)         | 75 Derby
19 Wakefield (.91)       | 38 Middlesbrough (1.20) | 57 Willesden (1.56)        | 76 Lincoln





* It is perhaps worth while putting on record the additional statistical constants obtained in 
deducing the above correlations, as they are probably fairly reliable values and should be compared 
with the two period constants on p. 561 : 


$A'$: Mean Attack-rate 1.26; Standard Deviation, Attack-rate .655
$I'$: Mean Isolation-rate 47.75; Standard Deviation, Isolation-rate 26.341
$\delta_1 A'$: Mean Increase in Attack-rate -.086; Standard Deviation of change in Attack-rate .648
$\delta_1 I'$: Mean Increase in Isolation-rate 9.03; Standard Deviation of Increase in Isolation-rate 1.05

Thus while most towns have been sensibly increasing their amount of isolation by 17% to 18% of
its mean value, the decrease in the attack-rate has only been 6% to 7% of the mean incidence, and
the correlations show that this decrease has not occurred in the towns with marked increase of 
isolation. 













table, and there can be little doubt that on the whole sea-port conditions and 
the big new neighbourhoods round London favour, while manufacturing con- 
ditions restrict, the incidence of diphtheria. We have not data, however, available 
upon which we could test water and milk supply, or extent of consumption of 
milk and fish in these towns. The results for Derby and Lincoln are remarkable, 
but they are high for all three periods, and this notwithstanding the rapid 
increase of isolation in those towns. 


At first sight it seemed to us that the towns in the first column were markedly 
those in which there had been a greatly restricted birthrate*, while those in the 
last column were towns of greater fertility. Taking the births per 100 married 
women from 15 to 45 (B) we found: 


$r_{AB} = +.013 \pm .075$.


Thus there is no association between incidence and the well-to-do character of a 
town as estimated by a low birthrate. 


Again having regard to the character of the towns in our first column, it 
occurred to us to test the incidence in relation to the employment of males in 
manufacturing processes involving smoke. We took out of the 1911 census the 
percentage (S) of males over 10 years of age, who fell under a rough test of 
smoke-producing occupations, namely IX. 1, X. 1-2, 5-8, XIV. 1, XV. and XVIII.
1-6 of the Registrar-General’s classification, and we found : 


$r_{AS} = -.180 \pm .073$.


This is possibly significant and would undoubtedly be emphasised had we 
included as a factor the women engaged in textile industries. There seems 
therefore some slight reason to suppose that the conditions favourable to smoke 
production are unfavourable to the spread of diphtheria. 


If the data could be procured, it would be worth while considering water and 
milk supply and the extent of fish consumption in the towns we have dealt with. 
If these were found to be of little influence, the road would certainly be clearer 
for dealing with the chronic diphtheritic human carrier as the chief source of 
the spread of the diphtheria bacillus. 


(10) Conclusions. 


(a) No influence of greater isolation in reducing the attack-rate from diph- 
theria is discoverable. In fact there is a sensible, if not large, positive association 
between the isolation-rate and attack-rate. 

(b) The case mortality is somewhat less where there is more isolation. This
may very probably be accounted for by more cases coming under specialised medical
care.


* We had partially in view here also the possibility that restricted birthrate meant employment of 
women and thus less breast-feeding and greater use of milk, so that cross-currents might be at work. 













(c) The attack-rate appears to be greater in the more prosperous towns and 
in towns of somewhat better sanitary conditions. We have not found the
prevalence of diphtheria associated with overcrowding or with the conditions leading
to high infant mortality. 


(d) While a low birthrate, taken either as a measure of prosperity, or as 
a measure of the employment of women and so of the prevalence of hand feeding, 
appears to have no significance for the attack-rate of diphtheria, smoke-producing 
manufactures are probably unfavourable to the prevalence of the disease, which 
appears to attach itself in the main to the large ports and metropolitan suburban 
districts. 


(e) The association between the attack- and isolation-rates observed is not 
very significant, and while it might, to a very small extent, be due to increased 
isolation following or accompanying increased attack, it is more probably an 
association due to the more prosperous towns practising more isolation, and also 
to there being some element in prosperity which assists the spread of the disease. 


Generally all the correlations are of a low order; they contain, however, 
nothing to support the theory that isolation markedly limits the incidence of 
diphtheria ; the disease itself does not appear where overcrowding is greatest nor 
where the population is most dense ; on the other hand isolation is most practised 
in those towns where domestic servants are most common and which may be 
supposed to be most prosperous. The chief argument for isolation—which can be 
drawn from the present data—is a lessened case-mortality, but such mortality 
might be obtained in all probability by specialised medical service as apart from 
isolation. 








MISCELLANEA. 


I. On the Probable Error of a Coefficient of Mean Square 
Contingency. 


By KARL PEARSON, F.R.S. 


Let the sampled population be considered as to two variates and be represented by the total
$M$ and the cell-frequency $m_{pq}$ for the $p$th row and $q$th column cell. Further let the vertical
marginal frequencies be given by $m_{.q}$ and the horizontal marginal frequencies by $m_{p.}$, so that
$$m_{1q} + m_{2q} + \dots + m_{pq} + \dots = m_{.q},$$
$$m_{p1} + m_{p2} + \dots + m_{pq} + \dots = m_{p.}.$$
Let the corresponding quantities for the sample be $N$, $n_{pq}$, $n_{.q}$ and $n_{p.}$.
Then we know that the mean square contingency $\phi^2$ is given by
$$\phi^2 = \frac{1}{N} S\left\{\frac{\left(n_{pq} - N\dfrac{m_{.q}}{M}\dfrac{m_{p.}}{M}\right)^2}{N\dfrac{m_{.q}}{M}\dfrac{m_{p.}}{M}}\right\} \qquad \text{(i)},$$
summed for every cell.
Now in the great bulk of statistical phenomena we do not know more of the sampled population
than is given by the sample, and thus to determine $\phi^2$ we must put $m_{.q}/M$ and $m_{p.}/M$ equal to
the most probable values known to us*, namely, $n_{.q}/N$ and $n_{p.}/N$. Doing this we obtain the usual
value for the mean square contingency
$$\phi^2 = S\left\{\frac{\left(n_{pq} - \dfrac{n_{.q} n_{p.}}{N}\right)^2}{n_{.q} n_{p.}}\right\} \qquad \text{(ii)}.$$


Starting from (ii) Blakeman and Pearson have found† the probable error of the mean square
contingency. The process is admittedly very laborious and although it has now been used fairly
often, it must be confessed that its chief value is to obtain appreciation of the probable errors of 
contingency coefficients in general, rather than in any usefulness in recording significant differences 
between long series of individual coefficients. 

But it has not been sufficiently recognised that the probable error thus found is that of the
approximate value of the mean square contingency (ii) and not that of its true value (i). It is 
indeed the probable error of the expression actually used, but it is not the probable error of the 
true value as given by (i). The latter is easy to find and deserves consideration. Let us write 


$$N m_{.q} m_{p.}/M^2 = \nu_{pq},$$
then
$$\phi_t^2 = \frac{1}{N} S\left\{\frac{(n_{pq} - \nu_{pq})^2}{\nu_{pq}}\right\} = \frac{1}{N} S\left(\frac{n_{pq}^2}{\nu_{pq}}\right) - 1,$$


* Pearson, Philosophical Magazine, Vol. L, 1900, pp. 164 et seq.
† Biometrika, Vol. V, pp. 191 et seq.










where we shall use $\phi_t^2$ and $\phi_a^2$ for the true and approximate values of the mean square
contingency. Thus
$$1 + \phi_t^2 = \frac{1}{N} S\left(\frac{n_{pq}^2}{\nu_{pq}}\right) \qquad \text{(iii)}.$$
Now for a sample of constant size $\nu_{pq}$ is constant and therefore representing small deviations
by differentials
$$d\phi_t^2 = \frac{2}{N} S\left(\frac{n_{pq}\,\delta n_{pq}}{\nu_{pq}}\right).$$
Square, add for all samples and divide by the number of such samples and we have
$$\sigma^2_{\phi_t^2} = \frac{4}{N^2}\left\{S\left(\frac{n_{pq}^2}{\nu_{pq}^2}\,\sigma^2_{n_{pq}}\right) + 2\,\Sigma\left(\frac{n_{pq}\, n_{p'q'}}{\nu_{pq}\,\nu_{p'q'}}\,\sigma_{n_{pq}}\sigma_{n_{p'q'}}\, r_{n_{pq} n_{p'q'}}\right)\right\},$$
where $\sigma_{n_{pq}}$ is the standard deviation of $n_{pq}$ and $r_{n_{pq} n_{p'q'}}$ is the correlation of deviations in $n_{pq}$ and
$n_{p'q'}$; $S$ is a summation for every cell and $\Sigma$ for every pair of cells.
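But it is well known* that
$$\sigma^2_{n_{pq}} = \kappa N \frac{m_{pq}}{M}\left(1 - \frac{m_{pq}}{M}\right),$$
$$\sigma_{n_{pq}}\sigma_{n_{p'q'}}\, r_{n_{pq} n_{p'q'}} = -\kappa N \frac{m_{pq}\, m_{p'q'}}{M^2},$$
where $\kappa$ is the factor $1 - (N-1)/(M-1)$, usually put unity, since $M$ is as a rule large compared
with $N$, and which will be here put unity for the remainder of the work. Hence
$$\sigma^2_{\phi_t^2} = \frac{4}{N} S\left(\frac{n_{pq}^2}{\nu_{pq}^2}\frac{m_{pq}}{M}\right) - \frac{4}{N} S\left(\frac{n_{pq}^2}{\nu_{pq}^2}\frac{m_{pq}^2}{M^2}\right) - \frac{8}{N}\,\Sigma\left(\frac{n_{pq}\, n_{p'q'}}{\nu_{pq}\,\nu_{p'q'}}\frac{m_{pq}\, m_{p'q'}}{M^2}\right),$$
or, giving $n_{pq}$ in the coefficients its mean value $N m_{pq}/M$,
$$\sigma^2_{\phi_t^2} = \frac{4}{N} S\left(\frac{N^2 m_{pq}^3}{M^3 \nu_{pq}^2}\right) - \frac{4}{N}\left\{S\left(\frac{N m_{pq}^2}{M^2 \nu_{pq}}\right)\right\}^2 \qquad \text{(iv)}.$$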


This is the standard deviation of the true value of the mean square contingency, and in most 
cases will be of no service, for we do not know the true values of $m_{pq}$ and $\nu_{pq}$.





If we put these equal to the values obtained from the actual sample under consideration we
obtain the approximate value of the standard deviation of the true mean square contingency,
which we may represent by the symbol $(\sigma_{\phi_t^2})_a$ and compare with what Blakeman and Pearson
found, i.e. $(\sigma_{\phi_a^2})_a$. Thus our alternatives are
$$\phi_a^2 \pm .67449\,(\sigma_{\phi_t^2})_a$$
and
$$\phi_a^2 \pm .67449\,(\sigma_{\phi_a^2})_a.$$
The real thing is $\phi_t^2 \pm .67449\,\sigma_{\phi_t^2}$.

Shall we obtain a better insight into the variation of this by taking the approximate values of
both $\phi_t^2$ and $\sigma_{\phi_t^2}$, or by taking the probable error of $\phi_a^2$? The problem is a subtle one, and
perhaps, only to be solved by experiment, not by theory. Of course when we take numerous
samples and calculate $\phi_a^2$, then $\sigma_{\phi_a^2}$ will measure their variability. But this is not what we seek.
We use $\phi_a^2$ as an approximation to $\phi_t^2$, and it is the variability of the true value that we want.
Are we not right in choosing $(\sigma_{\phi_t^2})_a$ as its best value? In short would not, on the average of a
great number of samples, $(\sigma_{\phi_t^2})_a$ give us a closer result to $\sigma_{\phi_t^2}$ than $(\sigma_{\phi_a^2})_a$?

Returning now to equation (iv) and putting in the observed values for $m_{pq}$, $\nu_{pq}$ we have
$$(\sigma_{\phi_t^2})_a^2 = \frac{4}{N}\left\{N\, S\left(\frac{n_{pq}^3}{(n_{.q} n_{p.})^2}\right) - \left[S\left(\frac{n_{pq}^2}{n_{.q} n_{p.}}\right)\right]^2\right\} \qquad \text{(v)}.$$


* The values here given are the true values before we approximate by putting $m_{pq}/M = n_{pq}/N$, etc.













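Or, after some reductions
$$(\sigma_{\phi_t^2})_a^2 = \frac{4}{N}\left\{\psi_a^3 + \phi_a^2 - \phi_a^4\right\} \qquad \text{(vi)},$$
where
$$\psi_a^3 = \frac{1}{N} S\left\{\frac{(n_{pq} - n_{.q} n_{p.}/N)^3}{(n_{.q} n_{p.}/N)^2}\right\} \qquad \text{(vii)}$$
and
$$\phi_a^2 = \frac{1}{N} S\left\{\frac{(n_{pq} - n_{.q} n_{p.}/N)^2}{n_{.q} n_{p.}/N}\right\} \qquad \text{(viii)}.$$
Again we have $\sigma_{\phi^2} = 2\phi\,\sigma_{\phi}$;
$$\therefore\ (\sigma_{\phi_t})_a = \frac{1}{\sqrt{N}}\left\{\frac{\psi_a^3}{\phi_a^2} + 1 - \phi_a^2\right\}^{1/2} \qquad \text{(ix)}.$$
Now what we usually need is the probable error of the contingency coefficient
$$C_1 = \sqrt{\phi^2/(1+\phi^2)}.$$
But $\sigma_{C_1} = \sigma_{\phi}\,(1 - C_1^2)^{3/2} = \sigma_{\phi}/(1+\phi^2)^{3/2}$.
Thus the probable error of the coefficient of mean square contingency
$$= \frac{.67449}{\sqrt{N}}\left\{\frac{\psi_a^3/\phi_a^2 + 1 - \phi_a^2}{(1+\phi_a^2)^3}\right\}^{1/2} \qquad \text{(x)}.$$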
This expression is much simpler than that for the probable error of the actually used value 
as given by Blakeman and Pearson*. It is not, however, asserted that it possesses greater 
theoretical validity. Those authors illustrate their formula by calculating the probable error of 
the contingency coefficient in the case of the association of handwriting and general intelligence 


in 1801 schoolgirls. They find
$$C_1 = .2957 \pm .0192.$$
In the course of their work they deduce
$$\phi_a^2 = .09580,$$
$$(\sigma_{\phi_a})_a = .03268,$$
$$\psi_a^3 = .14865.$$
Using these values we have from (ix)
$$(\sigma_{\phi_t})_a = \frac{1}{\sqrt{1801}}\left\{\frac{.14865}{.09580} + .90420\right\}^{1/2} = .03693.$$

It is clear therefore that $(\sigma_{\phi_t})_a$ does not differ very substantially from $(\sigma_{\phi_a})_a$. Calculating
from (x) the probable error of $C_1$, we find it = .0217, while the Blakeman-Pearson process gave
.0192. The two values only differ by .0025, which is unlikely to be of importance in the case of
most inferences in practical statistics.
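
The arithmetic of this illustration is easily retraced from formulae (ix) and (x). The following is a minimal sketch using only the three quantities quoted above (the mean square contingency .09580, the cubic moment .14865 and the sample size 1801); the function names are ours and purely illustrative.

```python
import math

def sigma_phi_true(psi3, phi2, n):
    """Formula (ix): approximate S.D. of the true contingency phi."""
    return math.sqrt(psi3 / phi2 + 1 - phi2) / math.sqrt(n)

def contingency_coefficient(phi2):
    """Coefficient of mean square contingency from phi squared."""
    return math.sqrt(phi2 / (1 + phi2))

def probable_error_c(psi3, phi2, n):
    """Formula (x): probable error of the contingency coefficient."""
    return 0.67449 * sigma_phi_true(psi3, phi2, n) / (1 + phi2) ** 1.5

# Values quoted in the text for the handwriting and intelligence data.
phi2, psi3, n = 0.09580, 0.14865, 1801

coefficient = contingency_coefficient(phi2)
sigma = sigma_phi_true(psi3, phi2, n)
prob_err = probable_error_c(psi3, phi2, n)
print(f"C = {coefficient:.4f}, sigma_phi = {sigma:.5f}, P.E. of C = {prob_err:.4f}")
```

To the number of decimals printed in the text these reproduce the coefficient .2957, the standard deviation .03693 and the probable error .0217.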

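Beyond the knowledge of $\phi_a^2$ only $\psi_a^3$ is required by the present process.
This may be written
$$\psi_a^3 = \frac{1}{N} S\left\{\frac{(n_{pq} - n_{.q} n_{p.}/N)^2}{n_{.q} n_{p.}/N} \times \frac{n_{pq} - n_{.q} n_{p.}/N}{n_{.q} n_{p.}/N}\right\}.$$
In finding the mean square contingency $\phi_a^2$, however, the three expressions
$$\frac{n_{.q} n_{p.}}{N}, \qquad n_{pq} - \frac{n_{.q} n_{p.}}{N}, \qquad \frac{(n_{pq} - n_{.q} n_{p.}/N)^2}{n_{.q} n_{p.}/N}$$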

* loc. cit. p. 194.










must have been written down for each cell and thus $\psi_a^3$ can be readily calculated. We can also
treat $\psi_a^3$ as given by
$$\psi_a^3 = N\, S\left\{\frac{n_{pq}^3}{(n_{.q} n_{p.})^2}\right\} - 1 - 3\phi_a^2,$$
but the cubing of the often rather large cell frequencies is troublesome, just as it is rather more
troublesome to calculate
$$\phi_a^2 = S\left(\frac{n_{pq}^2}{n_{.q} n_{p.}}\right) - 1$$
than
$$\phi_a^2 = \frac{1}{N} S\left\{\frac{(n_{pq} - n_{.q} n_{p.}/N)^2}{n_{.q} n_{p.}/N}\right\},$$
owing to the largeness of the squares in the former expression.
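
That the two routes of calculation agree may be seen on a small example. The sketch below computes the mean square contingency and the cubic moment for a contingency table both from the deviations of each cell from its independence value and from the raw squares and cubes of the cell frequencies. The 2 x 3 table is hypothetical, invented solely for the illustration, and the function name is ours.

```python
def contingency_moments(table):
    """Return (phi2_dev, phi2_raw, psi3_dev, psi3_raw) for a table of counts."""
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    dev_sq = dev_cu = raw_sq = raw_cu = 0.0
    for p, row in enumerate(table):
        for q, n_pq in enumerate(row):
            product = row_totals[p] * col_totals[q]
            expected = product / total           # independence value of the cell
            deviation = n_pq - expected
            dev_sq += deviation ** 2 / expected
            dev_cu += deviation ** 3 / expected ** 2
            raw_sq += n_pq ** 2 / product
            raw_cu += n_pq ** 3 / product ** 2
    phi2_dev = dev_sq / total                    # deviation form
    psi3_dev = dev_cu / total
    phi2_raw = raw_sq - 1                        # raw squares form
    psi3_raw = total * raw_cu - 1 - 3 * phi2_raw # raw cubes form
    return phi2_dev, phi2_raw, psi3_dev, psi3_raw

# HYPOTHETICAL 2 x 3 table of cell frequencies, for illustration only.
example = [[30, 10, 5], [10, 25, 20]]
phi2_dev, phi2_raw, psi3_dev, psi3_raw = contingency_moments(example)
print(phi2_dev, phi2_raw, psi3_dev, psi3_raw)
```

The deviation forms keep the intermediate numbers small, which is the convenience remarked upon in the text; the raw forms give the same values up to rounding.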


II. Measurements of Medieval English Femora. 


In a forthcoming memoir on the English Long bones there will be a good deal to be said 
about the conclusions reached by Dr Parsons in his recent paper on the Rothwell femora. 
Meanwhile he has started an attack on the Biometric School in a Journal whose columns are 
not open to adequate reply,—i.e. to a reply of not greater length than the published attack— 
from members of that school. In his communication he suggested that I was unacquainted with 
the condition of affairs at Rothwell, and behind this charge tried to escape any answers to 
the essential questions I asked him, and thus those questions still remain unanswered. 


The communication I made ran as follows : 


My informant who I hope is trustworthy speaks of (i) “the great mass of bones beneath the 
church at Rothwell” and (ii) of “the great collection of human bones beneath the old parish 
church at Rothwell” ; further (iii) “there are probably some 5000 or 6000 individuals represented 
in the vault at Rothwell, either altogether or in part”; and again (iv) “The stack varies in 
height and breadth, but is nowhere as high or broad as that at Hythe, although it is much 
longer. I know that at Hythe there are the remains of rather over 4000 people,....... I think 
that this collection contains more than this, partly because the stack is so much longer, partly 
because the bones are so much more decomposed and have therefore settled more.” 


Manouvrier after much piecing and mending while only able to measure the lengths of about 
16 femora from the neolithic burial places of Montigny and Esbly, was yet able to determine the
pilastric index of 127, and the platymeric index in 127 bones, that is to say in eight times as 
many bones as those for which he could obtain the maximum length. And had he dealt fully 
with the head and neck and the popliteal region, the multiplying factor would probably have 
been ten. Had piecing, mending and a maximum of care in handling been used, I can hardly
believe that what Manouvrier achieved at Montigny was not possible for Dr Parsons at Rothwell. 


Dr Parsons writes: “If the remains of femurs, whether they are fit or unfit for measurement, 
are counted it will be found that females are quite as numerous as males though measurable 
male femurs from their stronger build are less liable to break in being extricated from the pile of 
bones, and so there are more of them available for measurement.” The italics are mine. 

Much depends on the method of ‘extrication,’ and if the capacity of a bone to stand a hole 
being drilled in it with a bradawl be part of the necessary fitness for measurement then the 
number might undoubtedly be limited. But trusting to what I know has been achieved by the 







French, I feel convinced that if Dr Parsons could measure 277 femoral heads where the femoral 
length was measurable, he could easily have measured 2000 heads in all and thus have ascer- 
tained, definitely, whether his Rothwell series is unique in showing a significant depression in 
frequency between 45 and 47mm. Further he could on such material by dealing with numbers 
8 to 10 times those he has provided have given definite answers to many of the problems con- 
cerning platymery and the pilastric and popliteal indices, which other observers have been vainly 
trying to solve on far less adequate and in many cases far more fragmentary material. 


I would note that Dr Parsons gives no reply at all to my question of why he used Dwight’s
measurements as a criterion of sex when they referred to bones with the cartilages attached, 
because without this reply his careful attention to ‘other points’ when the head fell between 
45 and 47 mm. seems one-sided, and of no value in sexing the collection as a whole. He 
further gives no reply whatever to my question of why it is the male end, not the dwarf end, 
of his female distribution which is lacking, if absence of females be due to breakage. 


I would also state (i) that I have not sexed the Rothwell bones and therefore cannot say how 
far I should or should not agree with Dr Parsons. Dr Lee using the best available mathematical 
process found 145 ♀s and 133 ♂s, while Dr Parsons has 103 ♀s and 174 ♂s. How this shows any
agreement I fail to perceive ; (ii) that I have made no assertion about the bones being of the 
13th and 14th centuries. I merely headed my letter with Dr Parsons’ heading “ Measurements 
of Medieval English Femora,” and asked why, if Dr Parsons holds these bones to be such, he 
considers them without cartilages comparable with the mixed results of a modern American 
dissecting room plus the cartilages. 







CAMBRIDGE: PRINTED BY JOHN CLAY, M.A. AT THE UNIVERSITY PRESS 







